WN 05B000016

WORKING NOTE

Event Correlation Languages and Engines

March 2005

Frederick N. Chase

Sponsor: U.S.A.F.
Contract: FA8721-05-C-0001
Department: G026
Project: 0305615A

This document is intended for internal use and is not an official position of The MITRE Corporation.

© 2005 The MITRE Corporation

Center for Integrated Intelligence Systems
Bedford, Massachusetts

Executive Summary

Event correlation is at the heart of Security Information Management (SIM) products. The real-time search, by computer, for situations easily expressed in English still eludes the current generation of SIM products, which are unable to simultaneously minimize both false positives and false negatives. Normalization of event records from various sensors is at a primitive level: there is no inter-vendor standard for normalized events and existing vendor solutions will not generalize. A plausible medium-term research goal is to develop a proposed standard event language for normalized events, using existing ideas from taxonomy, ontology, and the semantic web. Offering a highly expressive and performant event correlation language is still an artificial intelligence challenge unlikely to be attained soon, but it is a plausible research goal to move to the next level in event correlation language power. This working note captures considerations relevant to these two research goals.

Acknowledgments

Special thanks go to Ariel Segall for pulling together the ‘MITRE’ natural correlation languages (weighted and non-weighted).
Table of Contents

1 Introduction
2 Background
3 Natural Correlation Languages
    3.1 Non-Weighted Rule Pseudo code
    3.2 Weighted Rule Pseudo code
4 Existing Correlation Languages
    4.1 netForensics
    4.2 e-Security
    4.3 neuSECURE
    4.4 Network Security Manager
    4.5 Security Threat Manager
    4.6 ArcSight
    4.7 ABLE
    4.8 JESS
    4.9 Chronicles
    4.10 Notes on Additional Languages and Engines
        4.10.1 ZCE (Zurich Correlation Engine)
        4.10.2 Blaze
        4.10.3 Spectrum
        4.10.4 Nerve Center
        4.10.5 TEC (Tivoli Enterprise Console)
        4.10.6 Tivoli ITM
        4.10.7 ILOG
        4.10.8 InCharge
        4.10.9 eAutomation
        4.10.10 EventWatch
        4.10.11 AMIT
        4.10.12 Yemanja
        4.10.13 ART*Enterprise
        4.10.14 NetExpert
        4.10.15 NetCool
        4.10.16 Versata
        4.10.17 ODE
        4.10.18 Snoop
5 Event Normalization
6 The Correlation Language Spectrum
7 Conclusion and Recommendations
List of References
Context Table for Logic Language Spectrum
Glossary

List of Figures

1 Logic Languages

List of Tables

1 ABLE RL Rule Types
2 ABLE Inference Engines
3 Expressivity Attributes of a Logic

Section 1 Introduction

This working note collects information, available in 2004/2005, on the status of engines intended to interpret and react, in near-real time, to network system events. The information grew out of the INSITE (Lighthouse SIM) task. Although our particular interest has been in attack events, it turns out that event correlation engines used in other fields are also suitable for management of security information. For example, credit card use policies, critically ill patient management, and network fault identification all have a similar need for event correlation, and solutions are available for them. These solutions also appear to be relevant to security event correlation.
This note captures the information we accumulated about the event messages, the correlation engines, and the languages used to drive the engines.

Section 2 Background

In our late-2004 evaluation of Security Information Management (SIM) systems, we found them to have a somewhat hollow center for handling the streams of sensor events collected across an enterprise IT network. Typically, there is an extensive report generation capability, but a notable lack of capability to identify new event patterns or to highlight known multi-event syndromes, whose symptoms collectively indicate urgent need for human attention.

SIM systems having a capability for event correlation generally lack the language specification and performance guidelines needed to evaluate that capability. Instead, the representative SIM system offers a GUI with drag-and-drop event references that can be connected by ’and’ and ’or’ logic. An extended ability in this area can reasonably be expected to improve SIM performance, as measured by the simultaneous reduction of both false positives and false negatives while using a single set of system sensitivity settings.

Since exposing the power of the language used to identify known multi-event syndromes is central to evaluating a SIM system, this paper explores the space of such languages. As discussed in [Eckmann 2000], it is possible to identify at least six classes of attack languages:

• event languages
• response languages
• reporting languages
• correlation languages
• exploit languages
• detection languages

The central focus of an event language is the sensor-detected event. The central feature of the language is the characterization of the format of individual event records, possibly in a natural language or by using a database schema to describe an event relation table. Typically a successful parse of an event language will simply accept zero or more event records. Examples of event languages are tcpdump packets and syslog messages.
No common event language is currently in widespread use: security information management systems commonly translate events in a wide variety of formats into a proprietary form, often termed normalized form, for which documentation may or may not be freely available.

A response language specifies actions to be taken in reaction to the detection of attack manifestations. Most IDSs currently do not have a well-defined response language. Instead they support invocation of shell scripts and possibly invocation of library functions written in general-purpose languages such as C or Java.

Reporting languages are used to describe alerts containing information about an attack, such as source of the attack, target of the attack, type of the attack (if known), events that are part of the manifestation, etc. Eckmann, Vigna, and Kemmerer note that a reporting language could also be used as an event language at a higher level. Two examples of proposed standardized reporting languages are the Common Intrusion Specification Language (CISL) and the Intrusion Detection Message Exchange Format (IDMEF).

Correlation languages react to multiple related events, possibly from a variety of sensors. As an adjunct, they may include asset valuation information allowing them to prioritize the alerts they emit.

Exploit languages are used to describe the steps to be followed to perform an intrusion. They are usually executable general-purpose languages such as C, C++, Perl, Tcl, Bourne Shell, Python, etc. There are also languages that are explicitly designed to support the scripting of attacks.

Detection languages are designed to support intrusion detection, and they are usually referred to as attack languages. These languages provide mechanisms and abstractions for identifying the manifestation of an attack.
The description of an attack is in a form that can be processed by the language runtime and can be matched against a stream of input events, looking for evidence that an instance of an attack is occurring.

See [Eckmann 2000], from which the above is drawn, for a more extended survey of these languages.

Section 3 Natural Correlation Languages

It is startling how well a controlled use of natural language expresses the intent of a rule block. For example, an HTTP exfiltration might be specified as follows:

Generate an alert when the browser/website data ratio is unusually large unless an authorized SOAP protocol and an authorized web site are involved. Bias towards alerting if, over time, a given browser sends an unusually large aggregate amount of data to web servers, particularly if it’s done using GETs.

This section describes in detail two similar rule pseudo code languages that we developed with the intention that they be user-friendly but plausible to implement [1]. One language uses weighted rules; the other uses no weights.

3.1 Non-Weighted Rule Pseudo code

    (rule-name
      ((assign varname1 template1)
       ...
       (assign varname10 template10))
      (and (test1 (template1 template2 template3))
           ...
           (test10 (template10)))
      ((action1)
       ...
       (action10)))

Non-weighted rules have three main parts: variable assignments, tests, and actions. Variable assignments allow us to refer repeatedly and easily to a particular assertion that corresponds to a desired template. Tests are arbitrary comparisons, with one or more arguments, that return a Boolean value. Tests can be nested, and the section containing the tests must output a single Boolean value. Actions are things which will take place if the Boolean returns true, such as firing an alert, adding or removing assertions from the database, etc.

Template: A template is a pattern which will be compared to assertions in the database.
When the correlation engine tests a rule, templates will be filled in with matching assertions to determine if the rule will fire. When describing what will happen when the rule is evaluated, template and assertion are equivalent; a written rule contains templates as function arguments, but a rule being evaluated passes assertions.

[1] Prototype implementations (in Prolog) for the MITRE weighted rules were coded.

Variable assignments: Sometimes, when looking for a database assertion to match, it is convenient to write the information only once and have a pointer to the results. This is particularly true when comparing components of different assertions, or when passing the matched assertion to functions. Variable assignments make this possible. Since we may have statements where not all templates need to be true (’or’ being the simplest case), variables with templates not matching anything in the database are valid but will always be considered false. For ease of function writing, they will return the empty assertion, ().

Tests: The simplest tests are the basic Booleans: ands, ors, and nots. However, we may wish to create more complex comparisons, which might require reasoning based on attributes of a given assertion (such as the time or source IP address) or relationships between assertions. Any function which takes one or more templates, variables, or Boolean values and returns a Boolean is acceptable. Tests can be nested, since they all resolve to Booleans. Tests must be combined (usually with a top-level ’and’ statement) so that there is a single Boolean result for the rule. From the perspective of the rule engine, there can only be one test statement, but it can be an arbitrarily complex function. The rule will fire when the top-level test statement returns true.

Actions: When the rule fires, every action statement is executed. Actions include sending an alert, adding assertions to the database, removing assertions from further consideration, etc.
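The three parts of a non-weighted rule described above (variable assignments, one top-level test, and actions) can be illustrated with a minimal Python sketch. It is not the Prolog prototype mentioned in the footnote. The names `Rule`, `match_template`, and `EMPTY`, and the dictionary representation of assertions and templates, are assumptions made for illustration only. For simplicity the sketch binds each variable to the first matching assertion; a real engine would likely consider every possible binding.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Assertion = Dict[str, object]   # hypothetical form, e.g. {"type": "login", "src": "10.0.0.5"}
Template = Dict[str, object]    # fields an assertion must carry in order to match
EMPTY: Assertion = {}           # stands in for the empty assertion "()"; it is falsy


def match_template(template: Template, database: List[Assertion]) -> Assertion:
    """Return the first assertion that carries every field of the template, else EMPTY."""
    for assertion in database:
        if all(assertion.get(key) == value for key, value in template.items()):
            return assertion
    return EMPTY


@dataclass
class Rule:
    name: str
    assigns: Dict[str, Template]                           # variable name -> template
    test: Callable[[Dict[str, Assertion]], bool]           # the single top-level test
    actions: List[Callable[[Dict[str, Assertion]], None]]  # executed when the rule fires

    def evaluate(self, database: List[Assertion]) -> bool:
        # Bind every variable; an unmatched template yields EMPTY, which tests treat as false.
        bound = {var: match_template(tpl, database) for var, tpl in self.assigns.items()}
        fired = bool(self.test(bound))
        if fired:
            for action in self.actions:
                action(bound)
        return fired


# Usage: alert when a port scan and a login share a source address.
alerts: List[str] = []
rule = Rule(
    name="scan-then-login",
    assigns={"scan": {"type": "port_scan"}, "login": {"type": "login"}},
    test=lambda b: bool(b["scan"]) and bool(b["login"])
    and b["scan"]["src"] == b["login"]["src"],
    actions=[lambda b: alerts.append(f"alert: scan and login from {b['scan']['src']}")],
)
db = [{"type": "port_scan", "src": "10.0.0.5"}, {"type": "login", "src": "10.0.0.5"}]
```

Calling `rule.evaluate(db)` binds both variables, the test returns true, and one alert string is appended to `alerts`. With a database that lacks a port scan, the `scan` variable is bound to the empty assertion and the rule does not fire.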
Actions can be conditional; for example, additional alerts could be fired if a particularly sensitive system was involved.

3.2 Weighted Rule Pseudo code

    (rule-name
      ((assign var1 template1)
       ...
       (assign var10 template10))
      (combination-function func)
      (((testA var1) weightA)
       ...
       ((testC var2 var3 var4) weightC))
      ((threshold value1 response action1)
       (threshold value2 response action2)))

A weighted rule, unlike a non-weighted rule, has four parts: variable assignments, a combination function, a series of tests with weights, and threshold-based actions. Every test must return a Boolean value; if the value is true, then the associated weight is added to the numbers the combination function will consider. A weighted rule can be modeled as a non-weighted rule with a single non-Boolean test, the combination function, and a series of conditional actions based on the output of that function.

Test: As in a non-weighted rule, a function which returns a Boolean based on any number of inputs, which will usually be templates or variables. The simplest possible test is for the presence of an assertion in the database.

Weight: A numerical value between -1 and 1 which will be passed to the combination function if the associated test returns true.

Combination Function: A function of a series of numbers, which returns a numerical value. The simplest combination function is addition. Another common function treats weights as probabilities: for each weight, it multiplies the weight by (1 - current total) and adds the product to the total, so that the result approaches 1 but only reaches it if some fact has a weight of 1, implying certainty.

Threshold: The threshold is a numerical value which is compared to the output of the combination function to determine an action. Positive threshold values trigger their associated action if the rule total is above the listed value. Negative threshold values trigger their associated action if the rule total is below the listed value.
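As a worked illustration of the two combination functions and the threshold comparison just described, the following Python sketch may help. The function names and the representation of tests as zero-argument callables are hypothetical. The sketch also assumes that only the first satisfied threshold, in listed order, selects an action, and that a threshold of exactly zero is treated as positive; the text does not settle either point. The probabilistic function applies the stated formula as-is and makes no special provision for negative weights.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def combine_add(weights: Sequence[float]) -> float:
    """Simplest combination function: plain addition of the weights."""
    return sum(weights)


def combine_probabilistic(weights: Sequence[float]) -> float:
    """Treat weights as probabilities: total <- total + w * (1 - total)."""
    total = 0.0
    for w in weights:
        total += w * (1.0 - total)
    return total


def evaluate_weighted(
    tests: List[Tuple[Callable[[], bool], float]],   # (test, weight) pairs
    combine: Callable[[Sequence[float]], float],
    thresholds: List[Tuple[float, str]],             # (value, action name), in listed order
) -> Tuple[float, Optional[str]]:
    # Only the weights of tests that returned true reach the combination function.
    fired_weights = [weight for test, weight in tests if test()]
    total = combine(fired_weights)
    for value, action in thresholds:
        above = value >= 0 and total > value   # positive threshold: fire when total is above it
        below = value < 0 and total < value    # negative threshold: fire when total is below it
        if above or below:
            return total, action               # assumption: the first satisfied threshold wins
    return total, None


# Usage: three true tests with weights 0.5, 0.5 and 0.2, and one false test.
tests = [(lambda: True, 0.5), (lambda: True, 0.5), (lambda: True, 0.2), (lambda: False, 1.0)]
total, action = evaluate_weighted(
    tests, combine_probabilistic, [(0.9, "page-supervisor"), (0.7, "console-alert")]
)
```

With these inputs the running total moves from 0.5 to 0.75 to approximately 0.8, which is above 0.7 but not above 0.9, so `action` is `"console-alert"`. Listing the stricter threshold first mirrors the advice that unusual threshold values should precede more common ones.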
Thresholds fire in the order they are listed, so unusual threshold values should be listed before more common ones.

Action: Actions include adding an assertion to the database (caching), removing an assertion from the database, firing an alert, or canceling processing on this rule. Actions can be combined, and can include conditionals.

Section 4 Existing Correlation Languages

In this section we are concerned primarily with correlation languages. We take sensor-generated events and the event languages as we find them and presume that a trace (a stream of normalized and time-sequenced events) is available for analysis.

We first describe event correlation languages implemented and available either within a SIM product or as separate commercial products. Then we describe several of the more promising languages for which we’ve been able to obtain sufficient documentation. Finally, we note languages which had been slated for investigation.

Of interest is the language power: it should be possible to define any reasonable combination of events. But even a low-level language using a large number of complex application programming interfaces qualifies in this regard, so it is also necessary to consider elegance and non-redundancy. In the following subsections, we therefore sometimes comment on the ’ratio’ of power to complexity: a high ratio of power to complexity is good.

It should also be possible to inject an identified correlation as a secondary event. The injection may either be into the sensor trace or into another, more abstract, trace also being processed by the same or a different event correlation engine.

Some of the SIM products include an asset-valuation aspect in their correlation language. Asset valuation assists in alert prioritization, but when an enterprise also uses a separate risk analysis process it implies either a capability to interoperate or a duplication of effort.
In addition, we postulate the following model for interaction between the SIM system and Security/Network Operations Center (SOC/NOC) personnel. For real-time management we assume a single scrolling console, below which is all the enterprise’s computerized and automated intelligence and above which is the entire staff of SOC/NOC personnel who accept notifications of event-syndrome instances needing human investigation. A message that scrolls by is termed an alert. A false positive is an alert that, upon evaluation, would better not have appeared on the scrolling console. A false negative is an alert that did not appear but would have been beneficial had it been present on the console.

For overall status assessment, we assume in addition a dashboard with an ongoing prioritized list of concerns generated over a longer system history, perhaps on a statistical basis. Prioritization will need minimally to be based on recency and perhaps on a SOC/NOC personnel-mediated triage process. Alerts from detection of low-and-slow activity (trickle scans and the like) will appear in this list.

4.1 netForensics

A rule in the netForensics Rules Based Correlation language is a graphical tree with the root Entry State node on the left and the leaves on the right. The Entry State references a primary event which acts as an initial indicator of a potential anomaly. The leaves of the tree on the right are nodes for which no further analysis is needed. Within the tree, multiple nodes to the right of a given node indicate a logical OR, and a string of two nodes, one to the right of the other, indicates a logical AND [2]. Every node has a timeout after which a firing instance of a particular node is nullified. Every node has an action. This action can be the generation of a secondary event. Normally only the leaf nodes result in an alert action.

4.2 e-Security

The e-Security Products by e-Security, Inc.
(http://www.esecurityinc.com) and Crystal Decisions use a parallel, distributed, tiered, in-memory [3] correlation engine architecture. Event normalization is based on e-Security’s event taxonomy. Each correlation engine searches for significant patterns, usually within certain timeframes. More than 125 correlation rules are available out of the box. A Rule Wizard collection allows definition of rule expressions based on regular expressions or on aggregation of a number of similar primary events within a timeframe. Their free-form RuleLg correlation rule definition language [4] is reportedly more general.

4.3 neuSECURE

GuardedNet’s neuSECURE uses impact correlation, statistical correlation, and rules-based correlation. Universal agent software is not installed on the sensors (except for Windows). When installed, it does not perform event logic: event correlation is centrally performed.

Risk is addressed by impact correlation. For a primary event, a multivariate algorithm considers many values in real time, including user-defined (tunable) asset value, threat source importance, event severity, and event validity. Watchlists identify either high-value IP addresses or blacklisted IP addresses.

[2] A general language issue is whether the two events occurring in the opposite temporal sequence are equivalent, or whether that equivalence must be diagrammed explicitly. More generally, a language should be able to easily indicate that several events are necessary, but need not occur in a particular order.

[3] Vulnerability, patch management, configuration data, asset valuation, and other relatively static “referential data” is stored in databases: it is distributed to subscribing correlation agents.

[4] This is the only SIM for which a language for rule definition is claimed.

GuardedNet’s notion of susceptibility refers to asset value together with Common Vulnerabilities and Exposures (CVE)-like host configuration information.
Both susceptibility and impact correlation are used to prioritize alerts.

The statistical correlation engine is fast and also performs in real time. This statistical correlation distinguishes neuSECURE from most other SIM products. Statistical correlation takes the atomic threat value determined by impact correlation and correlates it with trending for suspicious hosts and critical destinations.

The rules-based correlation functions as an adjunct to the statistical correlation. neuSECURE rules are defined through a GUI: there is no event correlation rules language.

For normalization, a common event taxonomy includes, for example, the value fw.accept (firewall accept). Altogether, about 15000 types of events are grouped into about 200 event classes. Apparently also, rules can depend on an unnormalized specific agentless conduit (SNMP, syslog, SMTP, CheckPoint, OPSEC, Cisco SecurePOP).

When a rule matches relevant fields of a normalized raw event, it invokes backward chaining to match relevant previous events. A rule can fire based on regular expression matching and, apparently, on database lookup. When the firing represents an alert, an Action of any sort can be invoked. If the entire rule fires, it optionally inserts a meta-event into the normalized input stream. A rule which fires only in the presence of this meta-event is termed a meta-rule.

4.4 Network Security Manager

The Network Security Manager by Intellitactics has a GUI where operators are dragged onto the rule and dropped into position to visually form a Boolean expression. Each operator is Java based and extensible. Database and regular expression support is evidently good. Almost nothing is available on the Internet as of Nov 2004 concerning Intellitactics rules.

4.5 Security Threat Manager

Security Threat Manager (STM) by OpenService ships pre-configured rule sets for each third-party event collector. Apparently, they execute on the collector.
Rule sets can be added or modified using a browser-based rules editor (rules programming application). Each rule in a set has three elements. The Activation precondition is an event or a named trigger. When activated, the rule Match determines whether the rule fires by using a regular expression match condition, a count, and a timeout. Matching results in an Action which is a named trigger or an alert. A rule set supports Boolean OR by using two or more rules whose output actions are each an Activation condition of the same other rule. A predicate cannot be computed from a database lookup. However, the separate Low-and-Slow application is database-oriented. Rules are processed, using the forward chaining principle, in the collector rather than centrally. STM uses a well-defined and documented Standard Log Format (SLF).

STM also allows the user to enter values for threat, vulnerability, and asset (IP address) value, which then influence the alert level. When an alert fires, it can provide an automated response of any sort by virtue of having an associated Action Hook Bundle.

4.6 ArcSight

ArcSight, by the ArcSight Corporation, uses Java SmartAgents running almost anywhere [5] to normalize and aggregate syslog, OPSEC, and many other heterogeneous events into a normalized ArcSight message. Using a SmartAgent, ArcSight can also report OS events other than those in system log files, involving file creation, account creation, privileged command execution, and other HIDS-oriented events.

ArcSight expects a customer to use hundreds or even thousands of rules. Hundreds are shipped with the product and others are created by using its GUI-driven Rules Editor or by writing XML [6]. The rules language uses simple logical operators such as AND and OR. An example rule is “If (an ids evasion attack) occurs (from the same source IP address) (3 times) within (2 minutes) then (send message to console) and (notify the security supervisor via pager)”.
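The example rule quoted above is an instance of a count-within-a-time-window pattern. The following Python sketch shows one generic way such a pattern could be evaluated; it is not ArcSight’s rule syntax or API. The class name `CountWithinWindow`, the event type string `ids_evasion`, and the choice to reset the counter after firing are all assumptions for illustration. The sketch also assumes events arrive in non-decreasing timestamp order.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class CountWithinWindow:
    """Fire when `count` events of one type from one source fall within `window_seconds`."""

    def __init__(self, event_type: str, count: int, window_seconds: float) -> None:
        self.event_type = event_type
        self.count = count
        self.window_seconds = window_seconds
        # One queue of recent timestamps per source IP address.
        self._seen: Dict[str, Deque[float]] = defaultdict(deque)

    def observe(self, timestamp: float, event_type: str, src_ip: str) -> bool:
        """Record one event; return True when the rule fires on this event."""
        if event_type != self.event_type:
            return False
        times = self._seen[src_ip]
        times.append(timestamp)
        # Drop timestamps that have slid out of the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        if len(times) >= self.count:
            times.clear()   # assumption: reset so that one burst produces one alert
            return True
        return False


# Usage: three evasion events from the same address within two minutes.
rule = CountWithinWindow("ids_evasion", count=3, window_seconds=120.0)
results = [rule.observe(t, "ids_evasion", "10.0.0.5") for t in (0.0, 50.0, 100.0)]
```

Here `results` ends with a single true value on the third event, at which point the console message and pager notification of the quoted rule would be issued. The same three events spaced 100 seconds apart would not fire, because the first has left the window by the time the third arrives.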
The rules seem to be database-oriented: they can be based on any attribute in the ArcSight Schema [7].

ArcSight TruThreat correlation uses asset values and collected vulnerability data to prioritize alerts based on risk.

[5] They run on a JVM.

[6] A "flexible scripting language that allows expression of SmartRules" presumably describes the same capability.

[7] There is apparently a buffered, optimized, in-memory aspect of the database for just-arrived-and-normalized events. If this is correct, then this would provide evidence of use of a sliding-window concept.

4.7 ABLE

ABLE (Agent Building and Learning Environment) is a framework, component library, and productivity tool kit for building intelligent agents using machine learning and reasoning. ABLE is very tightly integrated with Java classes and objects.

The body of a rule can be one of several different types. The basic rule body types are given in Table 1, ABLE RL Rule Types.

Table 1. ABLE RL Rule Types

Type                 Syntax                                                         Comment
-------------------  -------------------------------------------------------------  --------------------------------------------
Assertion                                                                           Assertions are simply assignment statements.
If-Then (Inference)  if ( <Boolean expression> ) then { <Action expression>+ }
When-Do              when ( <patternMatchClause>+ ) do { <actionExpression>+ }
While-Do             while ( <Boolean expression> ) do { <Action expression>+ }
Do/until             do { <Action expression>+ } until ( <Boolean expression> );    The action is evaluated at least once.
Do-While             do { <Action expression>+ } while ( <Boolean expression> );    The action is evaluated at least once.
Predicate Fact       <predicateName> ( <arg>* ).
Predicate Rule       <predicateName> :- <predicate>+ | <Boolean expression>+ .

The basic data types are words, identifiers, literals, numbers, lists, and variables. Java built-in data types such as Boolean, Integer, and Double are also found in ABLE-RL. Altogether, ABLE-RL supports the following built-in data types: Categorical, Continuous, Discrete, Fuzzy, TimePeriod, numeric, object, Boolean, Integer, Double, Number, and String.
ABLE-RL also includes user-defined data types (i.e., Java classes imported to the ruleset). Arrays of these data types and the Static attribute are supported.

In addition, the following hedges can be applied to a fuzzy set in a fuzzy clause: About, Above, Below, CloseTo, Extremely, Generally, InVicinityOf, Not, Positively, Slightly, Somewhat, Very.

The ABLE RuleSet Editor provides syntactic and semantic checking of rules and a test and debug environment.

The ABLE framework library includes ’AbleBeans’ for rule-based inferencing using Boolean and fuzzy logic, and for machine learning techniques. Rule sets created using the ABLE Rule Language can be used by any of the provided inferencing engines, which range from simple if-then scripting to light-weight inferencing to heavy-weight AI algorithms using pattern matching and unification.

A number of ABLE rule-block processors (pluggable inference engine beans) are implemented, as described in Table 2, ABLE Inference Engines. Not every combination of rule block and processor is valid.

Table 2. ABLE Inference Engines

Capability                  What It Processes
--------------------------  ------------------------------------------------------------------------------------------------
Boolean forward chaining    Implications (if-then rules) using forward chaining.
Boolean backward chaining   If-then rules using backward chaining.
Fuzzy forward chaining      If-then rules containing linguistic variables and hedges and several types of fuzzy sets; supports multistep chaining.
Pattern Match engine        When-do pattern match rules using forward chaining against a working memory.
Pattern Match network       When-do pattern match rules using the Rete network forward chaining algorithm against a working memory.
Predicate engine            Predicate rules using a back chaining algorithm with backtracking (similar to Prolog).
Scripting engine            Assignments, if-then, if-then-else, while-do, and do-while rules in sequential order.

Java objects can be created and manipulated using ABLE rules.
User-defined functions can be invoked from rules to enable external data to be read and actions to be invoked.

Material in this section has been drawn from: http://www.alphaworks.ibm.com/tech/able .

The ABLE Rule Language is defined in HTML at: http://www.research.ibm.com/able/doc/reference/com/ibm/able/rules/doc-files/arlIndex.html .

The ABLE Rule Language User’s Guide and Reference Version 2.1 is at: http://www.research.ibm.com/able/doc/reference/com/ibm/able/rules/docfiles/ABLERuleLanguage.pdf .

A tutorial on ABLE as ’AI in Java’ is at: http://www.devdaily.com/java/AI/ .

4.8 JESS

JESS is a forward-chaining rule processing system descended from OPS5 [Cooper 1988]. JESS is used in the MITRE State Predicted Interference Cancellation and Equalization (SPICE) application, which checks network traffic against a policy, such as the DoD Ports, Protocols, and Services standard. The principal reference is http://herzberg.ca.sandia.gov/jess/docs/61/intro.html .

4.9 Chronicles

Chronicles [MorDeb 2002] is a promising multi-alarm misuse correlation language developed at France Telecom in the late 1990s to more intelligently report related outages of telecommunication components. An implementation, the Chronicle Recognition System, is available at http://crs.elibel.tm.fr . The Chronicles language is nicely described at http://crs.elibel.tm.fr/en/manuals/index.mhtml .

Attribution of impact value, and therefore assessment of risk, is not part of Chronicles per se.

4.10 Notes on Additional Languages and Engines

At the time the project was shelved, the following notes had been accumulated on several additional correlation capabilities slated for investigation.

4.10.1 ZCE (Zurich Correlation Engine)

http://www.zurich.ibm.com/csc/infosec/gsal/projects/zce/

The Zurich Correlation Engine (ZCE) is a compact, Java-based, fast real-time correlation engine. It supports a wide range of correlation requirements with maximum performance.
Its unique rule replication function allows a single rule to automatically handle multiple instances of the same event signature. Its compact size makes it possible to deploy multiple, distributed correlation engines in an enterprise, allowing scalable correlation. As implemented in Tivoli Risk Manager, it correlates security information and risk alerts from firewalls, routers, networks, host- and application-based detection systems, desktops, and vulnerability scanning tools.

APPARENT RELEVANCE: High.

4.10.2 Blaze

Amit vs. Blaze (according to Amit!)

Blaze Advisor (Brokat) supports simple 'IF … THEN … ELSE' rules. It was used in the WebSphere Commerce Suite and in the Product Advisor, in ibm.com. Blaze is an advisor Structured Rule Language (SRL), a natural, English-like language. Advisor rules can be written against true objects such as Java, CORBA or COM/ActiveX, but Advisor can have rules written against database rows mapped as data-only objects. Blaze does not support events, and does not support a dynamic approach. It is basically based on a decision tree, and is DB-oriented in principle.

APPARENT RELEVANCE: Medium.

4.10.3 Spectrum

Aprisma's Spectrum

APPARENT RELEVANCE: TBD.

4.10.4 Nerve Center

NerveCenter by Veritas is less packet-centric: its orientation is toward the host and its applications.

http://www.veritas.com/products/nervectr/

Amit vs. Veritas (according to Amit!)

Veritas NerveCenter is also a system network management tool that correlates network events. When a predefined network condition is detected, Veritas stores the event information in a finite state machine called an alarm. The alarm continues to track the status of the object being monitored. To correlate and filter this data, Veritas relies on configurable models of network and system behavior, called behavior models, for each type of managed resource.
As in the case of InCharge, the event correlation in the Veritas system is designed to handle only network events, and it does not provide a general, domain-independent solution.

APPARENT RELEVANCE: can't be sure.

4.10.5 TEC (Tivoli Enterprise Console)

Prolog is used to customize TEC event correlation rules.

August 1998: http://www.internetweek.com/news/news080798-5.htm

The problem, according to Tivoli users and integrators, is that although the TME software is a great collector of management information, it does not do much to help IT administrators quickly isolate the source of a glitch. As a result, some Tivoli users are inundated with network alerts--sometimes called events--and must use their own ingenuity to differentiate the problem's origin from its symptoms.

Event correlation tools, such as SMARTS' InCharge, consolidate event information collected around the enterprise and automatically winnow out the pertinent data for the IT manager. The Tivoli Enterprise Console (TEC) can do some simple correlation, but many users and other experts say it doesn't do enough.

APPARENT RELEVANCE: Medium-high.

4.10.6 Tivoli ITM

ITM event correlation (IBM Tivoli Monitoring - event correlation)

IBM's overall umbrella product is the Tivoli Management Environment 10 (TME 10) framework.

GOOGLE: ITM event correlation

APPARENT RELEVANCE: Can't get info.

4.10.7 ILOG

ILOG http://www.ilog.com/products/

http://wwwmnmteam.informatik.uni-muenchen.de/projects/evcorr/

GOOGLE: ILOG event correlation

There is no easier way to integrate a rule engine into your Java application than ILOG JRules.

http://www.ilog.com/products/rules/whitepaper.pdf

ILOG Business Rule Studio lets you use the Eclipse IDE to embed rules into Java/J2EE applications. BR Studio is the first business rule authoring, testing and debugging environment for Eclipse -- available FREE for download!

APPARENT RELEVANCE: High, but these are 'business' rules.
4.10.8 InCharge

Smarts InCharge

Smarts InCharge won our Editor's Choice award, primarily because it handles correlation better than any other product we tested. Aprisma's Spectrum followed closely, thanks to its strong usability and correlation abilities. Aprisma's super-low pricing for our single-site scenario also makes the product worthy of a Best Value award, despite its high quote in the larger setting.

http://www.smarts.com/company/literature/white_papers.shtml

Amit vs. InCharge (according to Amit!)

SMARTS InCharge is a system network management tool that correlates events by employing a coding technique that matches alarms with signatures of known problems in real-time. A set of events that represent symptoms of a problem is treated as a code that identifies the problem. A codebook is a set of events that must be monitored to distinguish the problems of interest from each other. The supported pattern on event history is a conjunction of events within a time window. The event correlation in the InCharge system is designed to handle only network events. Their expressive power is limited to the network management domain, and they do not aim to provide a general, domain-independent solution that supports the fundamentals of a situation our active technology supports.

APPARENT RELEVANCE: High.

4.10.9 eAutomation

Referenced repeatedly in Ontology-based Correlation Engines by Stojanovic et al. Has an ontology!

Apparently described in: IBM Tivoli System Automation for Linux on xSeries® and zSeries®, Tech. Report SC33-8210-00, 2002.

APPARENT RELEVANCE: Medium-high. Can't get info.

4.10.10 EventWatch

EventWatch by Tavve
http://www.tavve.com/dynamic.asp?id=38&referrer=AdWords

EventWatch uses a patented methodology to discover the actual relationships among network entities and automatically build its own correlation database. .........
automatically updates.........you will never need to engage in the time-consuming and onerous task of building and maintaining correlation rules.

Seems focused on network & server outages.

APPARENT RELEVANCE: can't be sure.

4.10.11 AMIT

AMIT (Active Middleware Technology)
http://www.haifa.il.ibm.com/projects/software/amit/
http://www.haifa.il.ibm.com/projects/software/amit/approach.html

Examples:

1. If a platinum customer changed her stock portfolio at least twice this week by more than 10%, and her total investment is more than $1M, initiate a phone call to advise her.
2. If a gold or platinum customer deposited a sum of more than $10K in a checking account and did not withdraw money from the account within 2 days, initiate a phone call to advise him.

APPARENT RELEVANCE: Medium-low.

4.10.12 Yemanja

Yemanja is model-based, and uses a backward chaining state engine. For each entity (a device or conceptual component) a problem behavior model is developed. Entity-models contain a set of problem scenarios, a set of input events that they consume, and a set of output events that they publish. Published events can be consumed by another higher-level entity. Since the publisher does not know who its consumers are, adding a new entity-model does not require the modification of any existing entity-models.

http://www.mnlab.cs.depaul.edu/seminar/spr2003/yemanja.pdf

One reference is Yemini, S., Kliger, S., Yemini, Y., Ohsie, D., High Speed and Robust Event Correlation. IEEE Communications Magazine, May 1996.

APPARENT RELEVANCE: Medium-high.

4.10.13 ART*Enterprise

ART*Enterprise
http://www.brightware.com

APPARENT RELEVANCE: Low (business rules; probably costly).

4.10.14 NetExpert

http://www.osi.com/

APPARENT RELEVANCE: TBD.

4.10.15 NetCool

http://www.micromuse.com/index.html

APPARENT RELEVANCE: TBD.

4.10.16 Versata

Amit vs. Versata (according to Amit!)
Versata Studio & Logic Server (formerly VisualAge Business Rules) uses a declarative rather than procedural language for defining rules; in other words, what rather than how. It supports business rule types such as Derivation (computational) rules, Validation rules, Presentation rules, Integrity rules and Constraints. Versata translates system requirements to an EJB, a plain Java application, an HTML application, or directly into a relational database schema.

Limitations: Versata cannot support non-declarative requirements that cannot be translated to declarative business rules. For example, relationships that are more complex than parent-child (i.e., siblings, cousins, etc.); quantity-based discount schedules; batch driver loops (e.g., notify the contract administrator when a contract's expiration date has passed, if the contract is of type Service and has a value of more than $10,000); workflow, including time-based and calendar-driven rules enforcement; data retrieval with a user-defined business function. Versata does not support time, and users need semi-programmer skills for the process of defining rules.

APPARENT RELEVANCE: Medium.

4.10.17 ODE

Amit vs. ODE (according to Amit!)

ODE (developed in AT&T Bell Laboratories) detects composite events over an event history that contains all event occurrences. This information can only be used to impose some filtering conditions (masks) and equality conditions (parameters) on events that participate in an event expression (composite event). It is also limited to database events only.

APPARENT RELEVANCE: Medium.

4.10.18 Snoop

Amit vs. Snoop (according to Amit!)

Snoop (developed at the University of Florida) supports both database events and external events. It has limited expressive capabilities for the definition of time intervals using the operators A, A*, P, and P*, in association with a parameter context. Snoop cannot express all possibilities of event reuse (consumption) policies.
Although semantic information is reported with events in Snoop, this information cannot be used during the event composition.

APPARENT RELEVANCE: Medium-high.

Section 5
Event Normalization

An event is an action suitable for sensor auditing. It is also common to call the message that a sensor generates as a result of such an action an event. When an event record has been mapped into the standard (i.e., canonical, universal) format used for subsequent processing, it is normalized. Although event normalization is a necessary precursor of correlation, there is no industry inter-SIM standard for normalized events.

At the simplest level, normalization puts information in a variety of formats into a common format. A six-page white paper [DeRodeff 2002] illustrates this with an example, the passage of a packet connecting to IIS servers over port 80 and resulting in a remote printer buffer overflow. The event is shown unnormalized for Checkpoint, Cisco Router, Cisco PIX, and Snort. The paper proceeds to show how normalization for ArcSight is performed, with the four sources being values in a device type field and with an additional data field.

This approach has critical problems which can be described using the netForensics context, which we were able to study in a three-day lab course. The netForensics event console is essentially a table with a row for each primary event. The rows are normally sorted by the column holding the date/time of the event. Of the 40 or so columns, several, such as destination port, are relevant mainly for IP packets. There is not, for example, a column for:

  - the badge number of a person who entered a restricted area.
  - the filename of a file which was modified.
  - the username under which a file was modified.
  - the id of the token used to authenticate to a RADIUS authentication server.

There are customer-specific-A and customer-specific-B columns.
These fields will be unused by most customers, and correlation rules that a customer develops using these fields will be useless for sharing across the (vendor-specific) community.

There is an nF Alarm Category column with one of eight values: Virus/Trojan, Unknown/Suspicious, System Status/Configuration, Reconnaissance, Denial of Service, Access/Authentication/Authorization, Application Exploit, Policy Violations. We were told that a ninth category is being added! There is little likelihood that these categories have desirable characteristics such as being mutually exclusive and being exhaustive of all possibilities. This category column does not represent raw data. Rather, it represents a simple inference about the event and, as such, it can be questioned whether it is suitable as part of the normalized event. It might perhaps be better generated dynamically from a combination of normalized event, knowledge base, and query as the events are displayed.

While extensibility indicates that arbitrary incremental improvements are possible, it also indicates that the existing normalized format has an unknown list of inadequacies which are being discovered one by one. If more categories are added to the alarm categories column, or if more columns, some customer-specific, are added, or if an unformatted 'additional data' column is present, this approach will soon be overwhelmed. Rather, a standards effort should be initiated which will attempt to characterize all security-relevant events and, for each, the raw data features (attributes) which define the event. Only the sensor identifier should be in the normalized event record: sensor attributes should be accessed as needed when the event data is used.

Secondary (meta) event standardization should be attempted at the same time. These events may have a different normalization and appear in a different event trace, since they are at a higher level of abstraction.
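The recommendation above, that a normalized record carry only the sensor identifier while sensor attributes are looked up when needed, can be illustrated with a minimal sketch. Everything named here (the `NormalizedEvent` fields, the `SENSOR_REGISTRY` table, the event type labels, and the sample values) is a hypothetical assumption for illustration and is not taken from any SIM product or proposed standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical registry of sensor attributes, keyed by sensor identifier.
# Sensor details live here once, not in every event record.
SENSOR_REGISTRY = {
    "fw-01": {"kind": "firewall", "site": "a"},
    "badge-07": {"kind": "badge reader", "site": "b"},
}


@dataclass
class NormalizedEvent:
    sensor_id: str       # the only sensor information stored in the record
    timestamp: datetime
    event_type: str      # hypothetical label from an agreed event vocabulary
    attributes: dict = field(default_factory=dict)  # raw features of this event type


def sensor_attribute(event, name):
    """Look up a sensor attribute on demand instead of copying it into the event."""
    return SENSOR_REGISTRY[event.sensor_id].get(name)


packet_seen = NormalizedEvent(
    sensor_id="fw-01",
    timestamp=datetime(2005, 3, 1, 12, 0, tzinfo=timezone.utc),
    event_type="packet_observed",
    attributes={"destination_port": 80},
)
badge_entry = NormalizedEvent(
    sensor_id="badge-07",
    timestamp=datetime(2005, 3, 1, 12, 5, tzinfo=timezone.utc),
    event_type="restricted_area_entered",
    attributes={"badge_number": "0000"},
)

for event in (packet_seen, badge_entry):
    print(event.event_type, sensor_attribute(event, "kind"), event.attributes)
```

Because the attributes are defined per event type rather than as fixed console columns, the badge event does not need an unused destination port field, and the packet event does not need a badge number field.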
One promising thrust toward solving the normalization problem is discussed in [Stojanovic 2004]. The notion is that, if one restricts events to situations involving a change of state of a resource, an event can be described as a pair. One element is the state change (up to down, old password to new password, …). The other describes the resource whose state changed. Further, by defining an ontology of resources, the resource can be identified accurately and meaningfully in the event message.[8] A rule might generate a secondary event identifying a more general resource (e.g., customer relations' multi-site server farm) where several instances of that resource (e.g., server a at site b and others) had been identified in recent primary events.

[8] A simple Dewey-decimal-system-like hierarchy is only marginally adequate here; thus the need for an ontological approach.

Section 6
The Correlation Language Spectrum

An event correlation engine whose processing of the trace is specified by rules will attempt to use reasoning (according to the rules) on what is known (the trace window) to determine a manageably small and prioritized list of alerts. In addition to the event normalization issue discussed above, there are two major issues related to rule-based event correlation.

First, some sort of selection of events within the trace which are within scope is a practical necessity. The most obvious way to do this is by excluding events too far in the past and too far in the future. See the Sliding Window section above. For in-memory analysis, a second plausible way, more rough-and-ready, is to proceed with analysis on a window/chunk/strip of the event trace which uses most of the physical memory available, regardless of the time interval this offers between oldest and newest events. In addition to the inevitable windowing of the event trace, it is possible to re-inject summary events either into the trace or a separate meta-trace.
Certain rules will then look not only to the current trace window but also to the meta-trace.

The second major issue relates to the rule processing engine. Although not simply described, there is a spectrum of rule processing control structures and rule languages with associated logics. The spectrum has simplicity with efficiency on one end and power and generality on the other.

A control structure may offer some degree of parallelism. If the parallelism is enforced at the rule language level by warnings that the rule writer may not assume any particular order of execution, then there is the potential for better performance with a multiprocessor engine.

Among the simple, efficient, and powerful capabilities a rule engine may offer is caching. A caching rule asserts the knowledge just reasoned to be true back into the knowledge base (likely the trace, in the case of an event correlation rules engine) as a fact. The JESS assert and Prolog asserta or assertz predicates, for example, implement caching.

The choices of control structure and of logic are substantially independent but not all combinations are possible. For example, any use of caching or other side-effect is likely to assume a particular order of processing subgoals of a rule ANDed together or alternate ways to satisfy a rule ORed together. Therefore logic caching and control structure parallelism are incompatible.

Another simple, powerful and potentially very fast rule-based correlation engine is restricted to AND, OR, and NOT rules without binding of variables to values. This is a pure data-driven situation where a number of Boolean input facts propagate conjunction, disjunction or negation values rootward in a tree. After a limited number of steps the root truth value of the entire tree has been determined.

Regardless of the power of the rules language, an engine will assign relative priority to rules, facts, and queries (goals or conclusions).
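As an aside on the variable-free AND/OR/NOT engine described above, the following is a minimal sketch of how Boolean input facts might propagate rootward through a rule tree. The tuple encoding of the tree, the fact names, and the choice to treat an unreported fact as false (a closed-world assumption) are all illustrative assumptions, not features of any engine surveyed in this note.

```python
def evaluate(node, facts):
    """Evaluate a rule tree from the leaves toward the root.

    node is ("FACT", name), ("NOT", child), ("AND", child, ...) or
    ("OR", child, ...). facts maps fact names to booleans; a fact that
    is absent is treated as False (an assumption made for this sketch).
    """
    op = node[0]
    if op == "FACT":
        return facts.get(node[1], False)
    if op == "NOT":
        return not evaluate(node[1], facts)
    if op == "AND":
        return all(evaluate(child, facts) for child in node[1:])
    if op == "OR":
        return any(evaluate(child, facts) for child in node[1:])
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical rule: alert when repeated login failures are followed by a
# root login, unless a maintenance window is in effect.
ALERT_RULE = (
    "AND",
    ("FACT", "repeated_login_failures"),
    ("FACT", "root_login"),
    ("NOT", ("FACT", "maintenance_window")),
)

observed = {"repeated_login_failures": True, "root_login": True}
print(evaluate(ALERT_RULE, observed))
```

Since no variables are bound, each node is visited at most once per evaluation, which is why this style of rule can be very fast.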
When queries are given top priority, a backward chaining engine is implied. This is associated with goal-directed reasoning and with demand-driven processing architectures. The engine is activated by a query, typically submitted to a read-eval-print command processor. The engine tries to find a way to show that the query (Did someone gain root after guessing a password, ...) is true, binding variables as necessary to specific values (This IP address, ...). The system typically can be asked if there are other variable bindings which make the query true, giving rise to a list of answers. (Maybe there are several configuration details, each of which implies that the system has been rooted!) Rules and facts are typically prioritized by their order in the knowledge base (when the potential for parallelism is not being offered). Prolog gives a certain amount of control over this priority by offering two varieties of the assert clause (assert at beginning of list, assert at end of list). Backward chaining is appropriate when the ratio of facts to conclusions (goals) is large.

The opposite approach to rule processing is to give facts top priority: this forward chaining approach is associated with data-driven processing architectures. When the facts must be gathered by human effort, as from a medical patient, a questionnaire is often used. (When there are nevertheless many facts and some of them can be quickly determined to be irrelevant, the questionnaire may be structured to reflect this knowledge.) Jess, Datalog, and event-condition-action systems in general [Ullman 1997] are forward-chaining. Forward chaining is particularly appropriate when the ratio of facts to possible conclusions is small or when recently-discovered facts are of particular interest. Forward chaining may do lots of work that is irrelevant to the goal or goals of interest.

As a simple example, consider the following knowledge base.
Facts:
    cat(Felix)
    cat(Lulu)
    lives_with(Felix, Judie)
    lives_with(Lulu, Fred)
    is-allergic-to-cats(Judie)

Rules:
    has_allergic_reaction(X) -> sneeze(X)
    ( cat(Y) ^ lives_with(Y,X) ^ is-allergic-to-cats(X) ) -> has_allergic_reaction(X)

Goal:
    sneeze(Judie)

The backward chaining approach would start by posing the question, 'Did Judie sneeze?' The engine would discover any (and all, if asked repeatedly) explanations for concluding 'yes'.

The forward chaining approach would say, 'Well, can we infer something we're interested in, given our general knowledge plus the facts that Felix and Lulu are cats and we know who they live with and we know Judie is allergic to cats?'

In summary, backward chaining is suitable for answering questions against an enduring body of knowledge while forward chaining is suitable for reacting to a stream of input facts.

Control structures which are hybrids of forward and backward chaining are also possible. A notable example is the rule-cycle hybrid described in sections 6.4 and 7.9 of [Rowe 1988].

Concluding the forward/backward chaining discussion, we note that a system capable of meta-rules such as Prolog can be arranged/programmed to perform forward chaining [Rowe 1988] while a forward chaining system such as Jess can be arranged to perform backward chaining.

Language elements (for example, Prolog's 'univ' and 'clause' as well as 'asserta', 'assertz', and 'retract') may allow rules which create or alter other rules. To the extent that a fact is a special case of a rule whose conditions are always satisfied, we have already covered this under caching, above. However, we place the more sophisticated use of facts and rules to construct other rules near the top end of the spectrum of power and generality.

The spectrum from simplicity with efficiency to power and generality becomes more complex as one moves above Boolean predicate logic.
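Returning briefly to the cat-allergy knowledge base above, the sketch below runs it both ways: a forward chainer that derives every conclusion it can from the facts, and a backward chainer that starts from the goal sneeze(Judie). It is a toy written for this example only. The tuple encoding, the "?x" variable convention, and the function names are assumptions; underscores replace the hyphens in is-allergic-to-cats; and the backward chainer is deliberately simplified (no variable renaming, and goals proved through rules must already be ground), which is adequate here but not in general.

```python
FACTS = {
    ("cat", "Felix"),
    ("cat", "Lulu"),
    ("lives_with", "Felix", "Judie"),
    ("lives_with", "Lulu", "Fred"),
    ("is_allergic_to_cats", "Judie"),
}

# Each rule is (conditions, conclusion); terms starting with "?" are variables.
RULES = [
    ([("has_allergic_reaction", "?x")], ("sneeze", "?x")),
    ([("cat", "?y"), ("lives_with", "?y", "?x"), ("is_allergic_to_cats", "?x")],
     ("has_allergic_reaction", "?x")),
]


def match(pattern, fact, bindings):
    """Return bindings extended so that pattern equals fact, or None."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def substitute(pattern, bindings):
    return tuple(bindings.get(term, term) for term in pattern)


def satisfy(conditions, facts, bindings):
    """Yield every binding under which all conditions match known facts."""
    if not conditions:
        yield bindings
        return
    for fact in facts:
        extended = match(conditions[0], fact, bindings)
        if extended is not None:
            yield from satisfy(conditions[1:], facts, extended)


def forward_chain(facts, rules):
    """Data-driven: keep firing rules until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            for bindings in list(satisfy(conditions, known, {})):
                derived = substitute(conclusion, bindings)
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known


def prove(goals, facts, rules, bindings):
    """Goal-driven: yield bindings that prove every goal, left to right."""
    if not goals:
        yield bindings
        return
    goal, rest = substitute(goals[0], bindings), goals[1:]
    for fact in facts:                      # the goal may be a stored fact
        extended = match(goal, fact, bindings)
        if extended is not None:
            yield from prove(rest, facts, rules, extended)
    for conditions, conclusion in rules:    # or the conclusion of a rule
        head = match(conclusion, goal, {})
        if head is not None:
            subgoals = [substitute(c, head) for c in conditions]
            for _ in prove(subgoals, facts, rules, {}):
                yield from prove(rest, facts, rules, bindings)


def ask(goal):
    return any(True for _ in prove([goal], FACTS, RULES, {}))


print(ask(("sneeze", "Judie")))
print(sorted(forward_chain(FACTS, RULES) - FACTS))
```

The backward chainer touches only the facts needed for the one goal, while the forward chainer derives both has_allergic_reaction(Judie) and sneeze(Judie) whether or not anyone asked, which mirrors the trade-off described in the text.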
More powerful than predicate logic but less powerful than full first order logic are a large number of description logics. Description logics [Baader 1991] lack variables but have, in general, the characteristic that processing of an arbitrary query is guaranteed to terminate with proved/true or false, and not with unproved. They have at least the power of ALC, as described in [Schmidt 1991].

Table 3, Expressivity Attributes of a Logic, below, describes a number of expressivity attributes a given logic and its corresponding event correlation language may possess. For a review in less than a page of propositional logic, see [Sowa 2004a]. Likewise see [Sowa 2004b] for a one-page explanation of the entire essence of predicate logic.

Table 3. Expressivity Attributes of a Logic

Language/Logic Feature | Comments
propositions, with operators such as AND, OR, NOT | The essence of propositional logic, where truth tables are a central consideration.
closure | Closure implies that any expression of expressions is itself an expression.
coreference | In order for two expressions to reference the same expression or quantifier, variables or some equivalent are required.
N (simple number restrictions) | Entry Two
Q (qualified number restrictions) | Entry Two
quantification (existential and universal) | The first big step up from predicate logic. Instead of just concepts (predicates), instances of a concept are expressible as in 'there exists a person, p' or 'for all x y(x)'.
NOT without the closed-world assumption | In a closed world, it can be assumed that no unidentified instances of a concept exist. Therefore 'there exists' and 'for all' can be determined to be true or false, rather than unknown.
R (roles and role conjunctions but no role hierarchies) | A role is a binary relation between two concepts. Roles allow assertions that specific instances of two concepts are related. As an example, instance 5 of the concept agent may be related by the role causes to instances 1 and 9 of the concept action.
h or H (role hierarchies with single or multiple inheritance) | Entry Two
R+ (transitive roles) | Entry Two
I (inverse roles) | Entry Two
time, place, speaker, listener | Instead of an expression simply being true, unknown, or false, an expressive language will allow the assertion that it is true during a certain time interval or at a certain place. It may be asserted by a certain speaker or heard by a certain listener. In general, it is difficult to provide closure for these sorts of expressions.

Full first order logic is perhaps the most powerful and general non-procedural language which realistically might be used as an event correlation language. An informative and current reference on logic/language expressivity is section 2.5, Computational instantiations of ontologies, in [Bateman 2004].

The notion that use of these languages for knowledge representation (or query statement) should be restricted to limited subsets of logic is derived from some simple facts combined with a dubious assumption [Sowa 2003].

1. Various problems belong to different complexity classes, such as undecidable (possibly infinite time to compute), intractable (exponential or worse amounts of time), tractable (polynomial time), and scalable (linear, logarithmic, or constant time).
2. The complexity class of certain kinds of problems can be determined by syntactic tests on the problem statement that determine an upper bound on the amount of time required to solve the problem.
3. But the upper bound is not equal to the lower bound. Any constraint that eliminates some class of problems as inefficient will also eliminate infinitely many problems that are very easily (i.e., efficiently) computable.
The dubious assumption is that the syntax of a KR language should be limited in expressive power in a way that prohibits the expression of any problem that cannot be limited to a certain complexity class.

In taking a fresh look at event correlation languages then, complexity class should not be over-emphasized. Although first order logic reasoners are becoming more performant and description logic reasoners are becoming more powerful, there is considerable research affecting the best choice of language expressivity for a practical event correlation engine. Description logics have been somewhat in the shadows throughout the 1990s and there is the hope that in the next decade, robust and more expressive logic engines will be available.

Very near the powerful/general end of the spectrum (powerful languages with great generality) are the procedural rule languages such as the Java-related ABLE-RL. These languages exceed even full first-order logic in power, but the class of rules having a given result is huge, whereas the ideal class would have only a single way to specify a given rule. Ideally then, rules will be largely or even totally self-documenting and there will be no need for rule writing style guides and programming conventions.

Figure 1, Logic Languages,[9][10] is a diagrammatic representation of the features and languages discussed above. Particularly in its interactive form, it packs a great deal of knowledge into an easily-apprehended form.

[9] The "4 more" additional features not visible in the scrolling window for the intent of AL are Universal Quantification, Limited Existential Quantification, Role Names, and Concept Subsumption.
[10] The "2 more" languages in the extent of the concept near the bottom of the figure are CycL and KIF.

Figure 1. Logic Languages

An interactive Formal Concept Analysis was defined for the spectrum of logic languages. (See the information in the appendix Context Table for Logic Language Spectrum.)
Using the application 'ToscanaJ', the node with FaCT and two other languages was selected and then the diagram of Figure 1, Logic Languages, was captured. The selected node highlighted upward, showing each language concept included in the selected concept, and downward, showing each language concept with additional features.

Looking upward at the boxes above a node, one sees the feature intent of the selected concept. Looking downward at the boxes below a node, one sees the specific languages which are instances of that language concept. These instances are said to make up the extent of the selected concept. In Figure 1 we see, for example, that DAML+OIL is reported to have all the features of FaCT, plus inverses and role subsumption.[11]

[11] In this current working note version the figure should be regarded as merely a sketch: needed additional investigation and peer review is lacking.

Section 7
Conclusion and Recommendations

If SIM systems are to realize their potential for event correlation, the two research goals noted in the Executive Summary must be pursued. Without progress in these two areas, initial enthusiasm for SIMs will be replaced by a consensus that they are too demanding of resources to be useful.

This paper should be revised. The intent (structuring) of the formal concept analysis of logic languages should receive peer review. Each of the identified existing event correlation languages should be included in the extent of this analysis. The table Expressivity Attributes of a Logic should be revised and expanded to include and explain all of the features used in the analysis.

List of References

[Baader 1991] Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter Patel-Schneider, editors. The Description Logic Handbook. Cambridge University Press, 2002.

[Bateman 2004] Farrar, Scott and Bateman, John, ONTOSPACE Project General Ontology Baseline Deliverable D1, November 2004.

[Cooper 1988] Cooper, T. and Wogrin, N., Rule-Based Programming with OPS5. Morgan Kaufmann Publishers, 1988.

[DeRodeff 2002] DeRodeff, Colby, Got Correlation? Not Without Normalization, ArcSight White Paper, ArcSight Inc. 2002.

[Eckmann 2000] Eckmann, Steven T., Vigna, Giovanni, and Kemmerer, Richard A., [eckmann,vigna,kemm]@cs.ucsb.edu, STATL: An Attack Language for State-based Intrusion Detection, Reliable Software Group, Department of Computer Science, University of California, Santa Barbara, CA.

[MorDeb 2002] Morin, Benjamin and Debar, Herve, Correlation of Intrusion Symptoms: an Application of Chronicles, France Telecom R&D, Caen, France, {benjamin.morin|herve.debar}@rd.francetelecom.com

[Rowe 1988] Rowe, Neil C., Artificial Intelligence Through Prolog, Prentice-Hall, 1988.

[Sowa 2004a] http://www.jfsowa.com/logic/math.htm#Propositional

[Sowa 2004b] http://www.jfsowa.com/logic/math.htm#Predicate

[Sowa 2003] http://grouper.ieee.org/groups/suo/email/msg09051.html

[Schmidt 1991] Schmidt-Schauß, Manfred and Smolka, Gert. Attributive concept descriptions with complements. Artificial Intelligence, 48(1):1-26, 1991.

[Stojanovic 2004] Stojanovic. The role of ontologies in autonomic computing systems. IBM Systems Journal, Vol. 43, No. 3, 2004.

[Ullman 1997] Ullman J.D. and Widom J., A First Course in Database Systems, Prentice-Hall, 1997.

Appendix
Context Table for Logic Language Spectrum

The following data is an object attribute list (OAL). It represents the formal context of various language objects and their features. It should be imported by the Siena editor and written as a .csx file. The .csx file can be viewed by the ToscanaJ viewer of the ToscanaJ shareware package (http://toscanaj.sourceforge.net/). The OAL can be used as a starting point for revisions and corrections to the formal context analysis of logic languages.
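For readers who want to inspect the context without the ToscanaJ tools, here is a minimal sketch that reads OAL text into a mapping from each language to its set of features. It assumes one record per line in the form `object:attribute;attribute;...;`, which is an inference from the data in this appendix rather than a documented format specification; `parse_oal` and `SAMPLE` are names invented for this example. The two sample records are copied from the appendix data.

```python
def parse_oal(text):
    """Parse 'object:attr;attr;...;' records, one per line, into {object: set(attrs)}."""
    context = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at the first colon only: the object name precedes it.
        name, _, attrs = line.partition(":")
        context[name.strip()] = {a.strip() for a in attrs.split(";") if a.strip()}
    return context


SAMPLE = (
    "FaCT:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;"
    "Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;"
    "Qualified Number Restrictions on Roles;Transitivity over (Primitive) Roles;"
    "Propositions;Proposition Conj;Proposition Disj;Proposition Negation;\n"
    "DAML+OIL:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;"
    "Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;"
    "Inverses;Qualified Number Restrictions on Roles;Role Subsumption;"
    "Transitivity over (Primitive) Roles;"
    "Propositions;Proposition Conj;Proposition Disj;Proposition Negation;\n"
)

context = parse_oal(SAMPLE)

# Every FaCT feature is also a DAML+OIL feature (set inclusion of intents).
print(context["FaCT"] <= context["DAML+OIL"])

# The features DAML+OIL adds beyond FaCT.
print(sorted(context["DAML+OIL"] - context["FaCT"]))
```

The set difference reproduces the observation made for Figure 1 that DAML+OIL has all the features of FaCT plus inverses and role subsumption.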
Propositional Logic:Propositions;Proposition Conj;Proposition Disj;Proposition Negation; AL:Concepts;Concept Conj;Concept Negation;Universal Quant;Limited Existential Quant;Role Names;Concept Subsumption;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALU:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Role Names;Concept Subsumption;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALE:Concepts;Concept Conj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALN:Concepts;Concept Conj;Concept Negation;Universal Quant;Limited Existential Quant;Role Names;Concept Subsumption;Number Restrictions;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALC (Description Logic):Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALCI:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Inverses;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALCN:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Number Restrictions;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALCQ:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Qualified Number Restrictions on Roles;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; 37 ALCH:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Role 
Subsumption;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALCR+:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Transitivity over (Primitive) Roles;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; S:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Transitivity over (Primitive) Roles;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALCD:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Domains of Specified Datatypes;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALCDFD:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Domains of Specified Datatypes;functions;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; ALCQHIR+:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Qualified Number Restrictions on Roles;Transitivity over (Primitive) Roles;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; SHIQ:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Qualified Number Restrictions on Roles;Transitivity over (Primitive) Roles;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; DAML+OIL:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Inverses;Qualified Number Restrictions on Roles;Role Subsumption;Transitivity over (Primitive) Roles;Propositions;Proposition Conj;Proposition Disj;Proposition 
Negation; FaCT:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Qualified Number Restrictions on Roles;Transitivity over (Primitive) Roles;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; RACER:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Qualified Number Restrictions on Roles;Transitivity over (Primitive) Roles;Domains of Specified Datatypes;Propositions;Proposition Conj;Proposition Disj;Proposition Negation; 38 Full First Order Logic:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Inverses;Number Restrictions;Qualified Number Restrictions on Roles;Role Subsumption;Transitivity over (Primitive) Roles;Domains of Specified Datatypes;functions;Coreference (Variables);Propositions;Proposition Conj;Proposition Disj;Proposition Negation; CASL:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Inverses;Number Restrictions;Qualified Number Restrictions on Roles;Role Subsumption;Transitivity over (Primitive) Roles;Domains of Specified Datatypes;functions;Coreference (Variables);Propositions;Proposition Conj;Proposition Disj;Proposition Negation; SUO-KIF:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Inverses;Number Restrictions;Qualified Number Restrictions on Roles;Role Subsumption;Transitivity over (Primitive) Roles;Domains of Specified Datatypes;functions;Coreference (Variables);Propositions;Proposition Conj;Proposition Disj;Proposition Negation; CycL:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept 
Subsumption;Inverses;Number Restrictions;Qualified Number Restrictions on Roles;Role Subsumption;Transitivity over (Primitive) Roles;Domains of Specified Datatypes;functions;Coreference (Variables);Propositions;Proposition Conj;Proposition Disj;Proposition Negation;
KIF:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Inverses;Number Restrictions;Qualified Number Restrictions on Roles;Role Subsumption;Transitivity over (Primitive) Roles;Domains of Specified Datatypes;functions;Coreference (Variables);Propositions;Proposition Conj;Proposition Disj;Proposition Negation;
Second Order Logics:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Inverses;Number Restrictions;Qualified Number Restrictions on Roles;Role Subsumption;Transitivity over (Primitive) Roles;Domains of Specified Datatypes;functions;Coreference (Variables);Quantified predicate vars;Quantified function variables;Propositions;Proposition Conj;Proposition Disj;Proposition Negation;
Higher Order Logics:Concepts;Concept Conj;Concept Disj;Concept Negation;Universal Quant;Limited Existential Quant;Existential Quant;Role Names;Concept Subsumption;Inverses;Number Restrictions;Qualified Number Restrictions on Roles;Role Subsumption;Transitivity over (Primitive) Roles;Domains of Specified Datatypes;functions;Coreference (Variables);Quantified predicate vars;Quantified function variables;Propositions;Proposition Conj;Proposition Disj;Proposition Negation;

Glossary

Action: a) the changes invoked by a rule that succeeds, or b) a change in system state, having a moment and place of occurrence, caused by an agent.
Alert: A primary or secondary event placed onto the SIM system console for human evaluation and action.
Assertion: An item in the database, usually describing either a primary or secondary event.
Assertions will always be normalized.
Event: a) An action suitable for sensor auditing or b) the message generated by a sensor as a result of such an action. Discussion: Definition b, while used by SIM vendors, leaves one with the comfortable but erroneous impression that a false negative (an action for which no sensor could or did generate a message) is a non-event. In [Stojanovic 2004], for example, an event is taken to be a message. But of primary interest here is the fact that an event is further constrained to be a special kind of message, generated by a resource in the domain, that indicates a change of state of that resource. While resources and their state are of interest, this definition of event may be too narrow for human-intentional probing actions or initial steps in a scenario which might, when the last step is completed, bring a system down. In addition, by making the resource/system and the reporting sensor indistinguishable, this definition assumes that a system can be trusted to accurately report on its state.
Event, Commensurable: A secondary event in normalized form, suitable for injection into the recorded event stream.
Event, Primary: An event detected and reported by a sensor.
Event, Secondary: An event inferred by a correlation rule. Occasionally called a meta event. Can be implemented by caching/asserting.
Event, Normalized: An event record mapped into the standard (canonical, universal) intra-SIM format of a particular SIM. (There is no industry inter-SIM standard for normalized events.)
Predicate: an assertional function of one or more variables whose value, in a sufficiently-specified context, is true or false.
Proposition: an indivisible assertion whose value, in a sufficiently-specified context, is true or false.
Risk: a combination of (1) the likelihood that an undesirable impact will occur in some period of time, and (2) the undesirable impact.
Rule: Specifications, written in a rule language, sufficient to generate a secondary event upon processing a certain type of trace.
Rule Block: A coordinated collection of rules sufficient to generate an alert. Sometimes called a directive.
The SIM System Console: A putative teletype-like scrolling window on which all alerts appear, in order of occurrence.
Trace: The finite sequence of sensor-generated, normalized, time-sequenced events input to an event correlation engine.
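As an illustration of how the glossary terms trace, rule, primary event, and secondary event relate to one another, the following is a minimal Python sketch. It is not drawn from any SIM product or from the rule languages surveyed in this note: the Event fields, the event kinds, the rule name, and the three-failures-in-60-seconds threshold are all hypothetical choices made only for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Event:
    """A normalized event record (field names are assumptions for this sketch)."""
    time: float           # seconds; a trace is time-sequenced
    kind: str             # normalized event type (hypothetical vocabulary)
    source: str           # reporting sensor, or "rule:<name>" for a secondary event
    primary: bool = True  # False for events inferred by a rule


# A rule inspects a trace and may infer one secondary event.
Rule = Callable[[List[Event]], Optional[Event]]


def repeated_login_failure(trace: List[Event]) -> Optional[Event]:
    """Hypothetical rule: three 'login_failure' events within 60 seconds."""
    times = [e.time for e in trace if e.kind == "login_failure"]
    for i in range(len(times) - 2):
        if times[i + 2] - times[i] <= 60.0:
            return Event(times[i + 2], "suspected_password_guessing",
                         "rule:repeated_login_failure", primary=False)
    return None


def correlate(trace: List[Event], rules: List[Rule]) -> List[Event]:
    """Apply each rule to the trace and collect the secondary events."""
    results = []
    for rule in rules:
        inferred = rule(trace)
        if inferred is not None:
            results.append(inferred)
    return results


# A short trace of primary events from two (hypothetical) sensors.
trace = [
    Event(0.0, "login_failure", "sensor-a"),
    Event(20.0, "login_failure", "sensor-a"),
    Event(45.0, "login_failure", "sensor-b"),
]
secondary = correlate(trace, [repeated_login_failure])
```

Because the inferred event uses the same record type as the primary events, it could be appended to the trace and seen by other rules, which is the sense in which the glossary calls a normalized secondary event commensurable.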