IBM Haifa Research Lab – Event Processing

Event processing – past, present, future
VLDB 2010 Tutorial, Singapore, September 15th, 2010
Opher Etzion (opher@il.ibm.com)
© 2010 IBM Corporation

An example to kick off this tutorial: the luggage perspective
Across the 24 largest airlines, more than 5.6 million bags went missing in 2006, an average of 15.7 bags per 1,000 travelers; 15% of the bags are never found (BBC News, April 4, 2007). Event processing can help here:
• Act: Passenger has been rerouted to another destination – send the luggage
• Act: Bag has reached the wrong aircraft
• Notify: Bag has been checked but did not reach the ULD within 20 minutes
• Notify: Bag has been checked but did not reach the connecting flight

Another example: premature baby monitoring
The patient is hooked up to multiple monitors (in hospital or at home). The physician can set up event-based rules over multiple measurements and the patient’s history that determine when to send an alert and to whom.
[Figure: defined pattern]

Outline of this tutorial
• What is behind event processing? How is it related to other computing terms? Where are its roots?
• Event processing – architecture, building blocks
• The present: State of the practice in event processing – languages, implementation issues, challenges in implementing event processing applications
• The future: Trends and research challenges

What is behind event processing? How is it related to other computing terms? Where are its roots?

What is “event processing” anyway?
Event processing is a form of computing that performs operations on events.

In computing, we have processed events since the early days
Example: network and system management

Emerging technologies in enterprise computing (Gartner Hype Cycle, Summer 2009)

What’s new? The analog: moving from files to DBMS
In recent years, architectures, abstractions, and dedicated commercial products have emerged to support functionality that was traditionally carried out within regular programming. For some applications this is an improvement in TCO; for others it breaks the cost-effectiveness barrier.

What is an event?
There are various definitions of events. An event is an occurrence within a particular system or domain. The word “event” has a double meaning: the real-world occurrence as well as its computerized representation.

In daily life we often react to events

Sometimes we even react to the occurrence of multiple events
• “I closed the deal with the Australians”
• “I closed the deal with the Canadians”
• “We closed two huge deals in a single day. It is a good opportunity to send the whole team for some fun time in Singapore.”

Pattern detection is one of the notable functions of event processing
[Figure: event patterns]

What we actually want to react to are situations
Examples: toll violation, frustrated customer.
Sometimes the situation is determined by detecting that some pattern occurred in the flowing events (e.g., a toll violator).
Sometimes the events can only approximate, or indicate with some certainty, that the situation has occurred (e.g., a frustrated customer).

Event processing is being used for various reasons

Event Driven Architecture
Event driven architecture: asynchronous, decoupled; each component is autonomic.

Ancestor: active databases
• On event
• When condition
• Do action
• With coupling mode
Composite events were inherited by event processing.

Ancestor: data stream management systems
(Source: Ankur Jain’s website)

Ancestor: temporal databases
There is a substantial temporal nature to event processing. Recently, spatial and spatio-temporal functions are also being added.

Ancestor: messaging – pub/sub middleware

Ancestor: discrete event simulation

Ancestor: network and system management

Event processing – architecture, building blocks

Fast Flower Delivery
[Figure: Fast Flower Delivery application. Labels: Flower Store, Van Driver, Driver’s Guild, Location Service, Bid System, Assignment System, Control System, Ranking and Reporting System, Store Preferences, Delivery Request, GPS Location, Location, Bid Request, Delivery Bid, ranked drivers / automatic assignment, Manual Assignment, Assignment, Assignments, Pick Up confirmation, Delivery confirmation, Bid alerts, Assign Alerts, Pick Up Alert, Delivery Alert, Ranking and reports]

Event processing network

Example of EPN – part of the FFD example

The seven building blocks
Event Producer, Event Consumer, Event Channel, Event Type, Event Processing Agent, Context, Global State

Event type definition
• Header: system defined event attributes
• Payload: attributes specific to the event type
• Open content: additional free format data included in the event instance

Producer – state observer in workflows
State observer. Push: instrumentation points; Pull: query the state.

Producer – code instrumentation

Producer – syndication

Producers – streams to events

Producer – sensors

Consumer – performance monitoring dashboard

Consumer – Ambient Orb

Producer and consumer – Sixth sense

Twitter as a consumer

Event Processing Agents
• Filter
• Transform: Translate, Enrich, Aggregate, Split, Compose, Project
• Detect Pattern

The EPA picture
[Figure: Pattern detect EPA, from input to output. Labels: relevance filtering (input terminal filter expression, relevant event types), matching, context expression, instance selection, derivation (derivation expression), participant events, pattern matching set, not selected. Pattern signature: pattern type, pattern parameters, relevant event types, pattern policies]

Filter EPA
A filter EPA is an EPA that performs filtering only, and has no matching or derivation steps, so it does not transform the input event.
[Figure: Filter EPA with a principal filter expression (filtering); outputs: Filtered In, Filtered Out, Non-Filterable]

Transform EPA subtypes
Translate, Compose, Aggregate, Enrich, Split, Project

Sample of pattern types
• all: satisfied when the relevant event set contains at least one instance of each event type in the participant set
• any: satisfied if the relevant event set contains an instance of any of the event types in the participant set
• absence: satisfied when there are no relevant events
• relative N highest values: satisfied by the events which have the N highest values of a specific attribute over all the relevant events, where N is an argument
• value average: satisfied when the value of a specific attribute, averaged over all the relevant events, satisfies the value average threshold assertion
• always: satisfied when all the relevant events satisfy the always pattern assertion
• sequence: satisfied when the relevant event set contains at least one event instance for each event type in the participant set, and the order of the event instances is identical to the order of the event types in the participant set
• increasing: satisfied by an attribute A if, for all the relevant events, e1 << e2 implies e1.A < e2.A
• relative max distance: satisfied when the maximal distance between any two relevant events satisfies the max threshold assertion
• moving toward: satisfied when, for any pair of relevant events e1, e2, e1 << e2 implies that the location of e2 is closer to a certain object than the location of e1
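A few of the pattern types above can be sketched in a handful of lines. This is only an illustrative sketch: the `Event` class, the function names, and the use of a numeric occurrence time to stand for the `<<` (precedes) relation are assumptions made for the example, not part of any product shown in this tutorial. In particular, `sequence_pattern` takes one possible reading of "order is identical" (an ordered subsequence of the required types exists).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """Hypothetical event instance: a type name, an occurrence time, attributes."""
    type: str
    time: float
    attrs: dict = field(default_factory=dict)


def all_pattern(events, participant_types):
    # At least one instance of every participant type is present.
    return set(participant_types) <= {e.type for e in events}


def any_pattern(events, participant_types):
    # An instance of any of the participant types is present.
    wanted = set(participant_types)
    return any(e.type in wanted for e in events)


def absence_pattern(events):
    # No relevant events at all.
    return len(events) == 0


def sequence_pattern(events, participant_types):
    # Scan events by occurrence time and match the required types in order.
    wanted = list(participant_types)
    idx = 0
    for e in sorted(events, key=lambda ev: ev.time):
        if idx < len(wanted) and e.type == wanted[idx]:
            idx += 1
    return idx == len(wanted)


def relative_n_highest(events, attr, n):
    # The n events with the highest value of the given attribute.
    return sorted(events, key=lambda ev: ev.attrs[attr], reverse=True)[:n]


def increasing_pattern(events, attr):
    # For time-ordered events, the attribute must strictly increase.
    ordered = sorted(events, key=lambda ev: ev.time)
    return all(a.attrs[attr] < b.attrs[attr] for a, b in zip(ordered, ordered[1:]))


# Hypothetical bids, loosely modeled on the Fast Flower Delivery use case.
bids = [Event("DeliveryBid", t, {"ranking": r}) for t, r in [(1, 3), (2, 9), (3, 5)]]
top_two = relative_n_highest(bids, "ranking", 2)
```

Each function receives the relevant event set of a single context partition; deciding which events belong to that set is the job of the context and of relevance filtering, which this sketch leaves out.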
Pattern detection example
Find the five highest bids within the bid interval (taken from the Fast Flower Delivery use case).
• Pattern name: Manual Assignment Preparation
• Pattern type: relative N highest
• Context: Bid Interval
• Relevant event types: Delivery Bid
• Pattern parameters: N = 5; value = Ranking
• Cardinality: Single deferred

Why do we need policies? A simple example: heavy trading scenario
Why is defining patterns not that easy? Because we need to tune up the semantics.
Given: a stream of events of a single topic, about the activity in the stock market for a certain stock. An event is produced every 10 minutes when there is trade in the stock. Each event consists of: quote (current stock quote) and volume (the accumulated volume traded within these 10 minutes).
A selection specification: “trigger an automatic trade program if the volume exceeds 300,000 3 times within an hour; pass as an argument the last quote and the sum of the 3 volume values”.

Event-Id: E1 E2 E3 E4 E5 E6 E7 E8 E9 E10 E11 E12
Time-Stamp: 9:00 9:10 9:20 9:30 9:40 9:50 10:00 10:10 10:20 10:40 10:50 11:00
Quote: 33.23 33.04 33.11 33.01 32.90 33.04 33.20 33.33 33.11 33.00 32.78 32.70
Volume: 320,000 280,000 400,000 315,000 320,000 303,000 219,000 301,000 210,000 400,000 176,000

How many times is the trade program triggered? Which arguments are used in each triggering?
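The question has no single answer until the semantics are pinned down. The sketch below shows one ambiguity only: whether an event that already took part in a match may take part in the next one. The function name, the tuple layout, the reading of "within an hour" as a span of at most 60 minutes, and the sample stream are all assumptions made for illustration; the data is hypothetical and is not the table above.

```python
def triggers(events, threshold=300_000, window_min=60, consume=True):
    """Return one (last_quote, volume_sum) tuple per triggering.

    events: iterable of (minute, quote, volume), ordered by time.
    consume=True  -> an event takes part in at most one match.
    consume=False -> the two most recent matching events may be reused.
    """
    pending = []   # qualifying events that may still take part in a match
    fired = []
    for minute, quote, volume in events:
        if volume <= threshold:
            continue
        pending.append((minute, quote, volume))
        # Keep only candidates no older than the window, relative to this event.
        pending = [p for p in pending if minute - p[0] <= window_min]
        if len(pending) >= 3:
            last3 = pending[-3:]
            fired.append((last3[-1][1], sum(v for _, _, v in last3)))
            pending = [] if consume else pending[-2:]
    return fired


# Hypothetical stream: (minutes since start, quote, volume).
stream = [
    (0, 10.0, 310_000),
    (10, 10.1, 250_000),
    (20, 10.2, 400_000),
    (30, 10.3, 320_000),
    (40, 10.4, 330_000),
    (50, 10.5, 305_000),
]

once = triggers(stream, consume=True)     # [(10.3, 1030000)]
thrice = triggers(stream, consume=False)  # three triggerings
```

With the same six events the program is triggered once if matched events are used up, and three times if they may be reused, and the arguments passed differ accordingly.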
Pattern policies
• Evaluation policy: determines when the matching sets are produced (e.g., deferred)
• Cardinality policy: determines how many matching sets are produced within a single context partition
• Repeated type policy: determines what happens if the matching step encounters multiple events of the same type. Options: every, override, first, last, with max/min value of…

Pattern policies – cont.
• Consumption policy: specifies what happens to a participant event after it has been included in a matching set. Options: consume, reuse, bounded reuse
• Order policy: specifies how temporal order is defined. Options: by occurrence time, by detection time, by stream position, by attribute

Our entire culture is context sensitive
In the play “The Teahouse of the August Moon” one of the characters says: “Pornography question of geography”.
• This says that in different geographical contexts people view things differently
• Furthermore, the syntax of the sentence (no verbs) is typical of the way the people of Okinawa talk
When attending a concert, people do not talk or eat, and they keep their mobile phones on “silent”.

Context has three distinct roles (which may be combined)
• Partition the incoming events: the events that relate to each customer are processed separately
• Grouping events together: grouping together events that happened in the same hour at the same location
• Determining the processing: different processing for different context partitions

Context definition
A context is a named specification of conditions that groups event instances so that they can be processed in a related way. It assigns each event instance to one or more context partitions.
A context may have one or more context dimensions: temporal, spatial, state oriented, segmentation oriented.

Context types: examples
• Segmentation oriented: “All children 2-5 years old”; “All platinum customers”
• Temporal: “Every day between 08:00 and 10:00 AM”; “A week after borrowing a disk”; “A time window bounded by TradingDayStart and TradingDayEnd events”
• Spatial: “3 miles from the traffic accident location”; “Within an authorized zone in a manufactory”
• State oriented: “Airport security level is red”; “Weather is stormy”

Context types
• Temporal: fixed interval, event interval, sliding fixed interval, sliding event interval
• Spatial: fixed location, entity distance location, event distance location
• State oriented
• Segmentation oriented

The present: State of the practice in event processing – languages, implementation issues, challenges in implementing event processing applications

An observation
The Tower of Babel symbolizes the tendency of humanity to talk in multiple languages.
The event processing area is no different: most languages in the industry follow the hammer-and-nails syndrome and extend existing approaches:
• imperative script language
• SQL extensions
• extension of inference rule language
It does not seem that we will settle on a single programming style in the near future. The EPTS language analysis workgroup aims to understand the various styles and to extract common functions that can be used to define what an event processing language is; this tutorial is an interim report.

Existing styles for EP languages (samples)
[Figure: language styles with sample products. Styles: inference rules, ECA rules, agent oriented, SQL extension, state oriented, imperative / script based. Products shown: WBE, Prova, TIBCO, XChangeEQ, Starview, EventZero, AMiT, RuleCore, Agent Logic, Spade, Aleri, Coral8, Esper, Netcool Impact, Streambase, Oracle, Apama]
* If we add simple and mediated event processing, the picture is even more diversified.

StreamBase Studio

StreamBase pattern matching

CCL Studio (Coral8 / Sybase)

CCL – pattern matching
RFID monitoring application: checks if a tag has been seen by readers A and B, then C, but not D, within a 10 second window.
    Insert into StreamAlerts
    Select StreamA.id
    From StreamA a, StreamB b, StreamC c, StreamD d
    Matching [10 seconds: a && b, c, !d]
    On a.id = b.id = c.id = d.id

Microsoft StreamInsight

    var topfive = (from window in inputStream.Snapshot()
                   from e in window
                   orderby e.f ascending, e.i descending
                   select e).Take(5);

    var avgCount = from v in inputStream
                   group v by v.i % 4 into eachGroup
                   from window in eachGroup.Snapshot()
                   select new { avgNumber = window.Avg(e => e.number) };

Esper EPL – FFD example

    /*
     * Not delivered up after 10 mins (600 secs) of the request target delivery time
     */
    insert into AlertW(requestId, message, driver, timestamp)
    select a.requestId, "not delivered", a.driver, current_timestamp()
    from pattern[
      every a=Assignment ->
        (timer:interval(600 + (a.deliveryTime-current_timestamp)/1000)
         and not DeliveryConfirmation(requestId = a.requestId)
         and not NoOneToReceiveMSG(requestId = a.requestId))
    ];

ruleCore – Reakt
• Event stream view: a unique context of events. A view contains a window into the inbound stream of events and commonly contains only semantically related events
• Situation: an interesting combination of multiple events as they occur over time. Example: an item with an RFID tag being picked up from the shelf and then moving past the checkout without being paid for
• Rule: an active event processing entity reacting to specific combinations of inbound events over time
• Action: the last part of a rule's evaluation in response to a detected situation

AMiT terminology
[Figure: AMiT terminology. Labels: input events (e1, e2, e3, e5, e8); event selection (keys, conditions); operation and operator (joining, counting, temporal, absence, aggregation); lifespan (initiator, terminator); situation; actions (notifications, messages, definition updates, user plug-ins)]

AMiT – situation

IBM WebSphere Business Events

Apama EPL – FFD examples

Apama Simulation Studio – cont.

Non-functional properties – scalability
Scalability is the capability of a system to adapt readily to a greater or lesser intensity of use, volume or demand while still meeting its business objectives.
• Scalability in the volume of processed events
• Scalability in the quantity of agents
• Scalability in the quantity of producers
• Scalability in the quantity of consumers
• Scalability in the quantity of context partitions
• Scalability in context state size
• Scalability in the complexity of computation
• Scalability in the processor environment

Non-functional requirements – security
• Ensuring only authorized parties are allowed to be event producers or event consumers
• Ensuring that incoming events are filtered so that authorized producers cannot introduce invalid events, or events that they are not entitled to publish
• Ensuring that consumers only receive information to which they are entitled. In some cases a consumer might be entitled to see some of the attributes of an event but not others
• Ensuring that unauthorized parties cannot add new EPAs to the system, or make modifications to the EPN itself (in systems where dynamic EPN modification is supported)
• Keeping auditable logs of events received and processed, or other activities performed by the system
• Ensuring that all databases and data communications links used by the system are secure
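Two of the security requirements above (producers may publish only the event types they are entitled to, and consumers may see only the attributes they are entitled to) can be sketched as simple checks at the edges of the event processing network. The function names, the dictionary-based event layout, and the entitlement tables are hypothetical and serve only as an illustration; a real system would take these from its own access-control facility.

```python
def accept_from_producer(event, producer, publish_rights):
    """Admit an event only if the producer is entitled to publish its type.

    publish_rights: hypothetical mapping of producer name -> set of event types.
    """
    return event.get("type") in publish_rights.get(producer, set())


def project_for_consumer(event, entitled_attrs):
    """Return a copy of the event with only the attributes the consumer may see."""
    return {name: value for name, value in event.items() if name in entitled_attrs}


# Hypothetical entitlements and event.
publish_rights = {"van_driver_app": {"DeliveryConfirmation"}}
event = {"type": "DeliveryConfirmation", "requestId": 17, "driver": "D42"}

admitted = accept_from_producer(event, "van_driver_app", publish_rights)
store_view = project_for_consumer(event, {"type", "requestId"})
```

Filtering on the way in and projecting on the way out keeps the entitlement logic outside the EPAs themselves, so agents can be written without knowledge of who consumes their output.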
Performance objectives
Objective number, objective name: objective metric
1. Max input throughput: maximize the quantity of input events processed by a certain system or sub-system within a given time period
2. Max output throughput: maximize the quantity of derived events produced by a certain system or sub-system within a given time period
3. Min average latency: minimize the average time it takes to process an event and all its consequences in a certain system or sub-system
4. Min maximal latency: minimize the maximal time it takes to process an event and all its consequences in a certain system or sub-system
5. Latency leveling: minimize the variance of processing times for a single event or a collection of events in a certain system or sub-system
6. Real-time constraints: minimize the deviation in latency, from a given value, for the processing of an event and all its consequences in a certain system or sub-system

Optimizations
• Optimizations related to EPA assignment: partition, parallelism, distribution and load shedding
• Optimizations related to the coding of specific EPAs: code optimization, state management
• Optimizations related to the execution process: scheduling, routing optimizations and load shedding

Putting derived events in order

Ordering in a distributed environment: possible issues
• The occurrence time of an event is accurate, but the event arrives out of order, and processing that should have included the event might already have been executed.
• Neither the occurrence time nor the detection time can be trusted, so the order of events cannot be accurately determined.
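The first issue above (accurate occurrence times, out-of-order arrival) is commonly handled by holding each event for a bounded time before releasing events in occurrence-time order. The sketch below is a minimal illustration under the assumption that reporting delay is bounded by a known time-out; the class name `ReorderBuffer`, its methods, and the explicit `now` argument are hypothetical choices made for this example.

```python
import heapq
from itertools import count


class ReorderBuffer:
    """Hold events for `timeout` time units, then release them sorted by occurrence time."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._heap = []        # entries: (occurrence_time, sequence_no, payload)
        self._seq = count()    # tie-breaker so payloads are never compared

    def arrive(self, occurrence_time, payload, now):
        """Accept an event at time `now`; return the events that became safe to process."""
        if occurrence_time < now - self.timeout:
            # Arrived after its time-out expired: ignored under the stated assumption.
            return self.release(now)
        heapq.heappush(self._heap, (occurrence_time, next(self._seq), payload))
        return self.release(now)

    def release(self, now):
        """Pop every buffered event whose occurrence time plus time-out has passed."""
        ready = []
        while self._heap and self._heap[0][0] + self.timeout <= now:
            occurrence_time, _, payload = heapq.heappop(self._heap)
            ready.append((occurrence_time, payload))
        return ready


buf = ReorderBuffer(timeout=5)
r1 = buf.arrive(10, "a", now=11)     # nothing is old enough yet
r2 = buf.arrive(8, "b", now=12)      # arrived out of order, still buffered
r3 = buf.arrive(12, "c", now=13)     # "b" (occurred at 8) is now released
r4 = buf.arrive(3, "late", now=14)   # past its time-out: dropped
r5 = buf.release(now=20)             # remaining events, in occurrence order
```

The price of this approach is added latency: every event waits for the full time-out before it is processed, so the time-out has to be chosen against the latency objectives listed earlier.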
Solutions

Buffering techniques
Assumptions:
• Events are reported by the producers as soon as they occur
• The delay in reporting events to the system is relatively small, and can be bounded by a time-out offset
• Events arriving after this time-out can be ignored
Principles:
• Let τ be the time-out offset. According to the assumptions, it is safe to assume that at any time-point t, all events whose occurrence time is earlier than t - τ have already arrived
• Each event whose occurrence time is To is kept in the buffer until To + τ, at which time the buffer can be sorted by occurrence time, and the events can then be processed in this sorted order

Retrospective compensation
• Find all EPAs that have already sent derived events which would have been affected by the "out-of-order" event if it had arrived at the right time
• Retract all the derived events that should not have been emitted in their current form
• Replay the original events with the late one inserted in its correct place in the sequence, so that the correct derived events are generated

Inexact event processing
• Uncertainty whether an event actually occurred
• Inexact content in the event payload
• Inexact matching between derived events and the situations they purport to describe
[Figure: sources of inexactness. Labels: source malfunction, malicious source, imprecise source, sampling or estimate, projection of temporal anomalies, propagation of inexactness, uncertain event, inexact event content]

False positives and false negatives
False positive situation detection refers to cases in which an event representing a situation was emitted by an event processing system, but the situation did not occur in reality.
False negative situation detection refers to cases in which a situation occurred in reality, but the event representing this situation was not emitted by an event processing system.

Retraction
[Figure: order cancellation. Stages: Delivery Request issued, Bid Request issued, Assignment issued, Pick-up occurred, Delivery occurred. Cancellation handling labels: cancel bid preparation; cancel bid and assignment process; notify assigned driver about cancellation; may still be cancelled with some penalty; undoable]

The future: Trends and research challenges

Trend I: Going from narrow to wide
Some recently reported applications (EPTS use-cases WG):
• Border security radiation detection
• Mobile asset geofence
• Logistics and scheduling
• Unauthorized use of heavy machinery
• Hospital patient and asset tracking
• Activity monitoring for taxing and fraud detection
• Intelligent CRM in banking
• EDA and asynchronous BPM in retail
• Situation awareness in energy utilities
• Situation awareness in airlines
• Reduce cost in injection therapy
• Next generation navigation
• Real-time management of hazardous materials
• Finding anomalies in point of sales in retail stores
• Elderly behavior monitoring
(Source: ebizQ Event processing market pulse)

Trend I: Going from narrow to wide
Taking event processing outside enterprise computing:
• Home automation
• Robotics
• Bio-informatics
• Socio-technical systems

Trend II: Going from monolithic to diversified
• Variety of functions
• Variety of Quality of Service requirements
• Variety of platforms
“One size fits all” will not work. Instead: a collection of building blocks that can fit together.

Trend III: Going from proprietary to standard-based – standard directions
• The current situation is a Tower of Babel: variety of languages, event representation…
• Standards serve as an enabler for achieving the other trends and general maturity
• The shift of vendors from start-up dominant to bigger companies makes the atmosphere friendlier towards standards
Areas for standards: modeling, event representation, interoperability, languages
[Figure label: PIM]

Trend IV: Going from programmer centered to semi-technical person centered
(Source: ebizQ Event processing market pulse)

Trend V: Going from stand-alone to embedded
• Packaged applications
• Middleware and platforms
• Business Activity Monitoring
• Sensor platform

Trend VI: Going from reactive to proactive
[Figure labels: End of game; TRAFFIC JAM]

Emerging directions: four directions to observe
• Multiple platforms – same look and feel
• Tailor-made optimizations
• The engineering of constructing EP applications
• Adding intelligence to event processing

Emerging direction I: multiple platforms – same look and feel
[Figure: a virtual event processing platform spanning appliance, stream platform, cloud computing platform, ESB / messaging platform, embedded]

Emerging direction II: tailor-made optimizations
• Local optimizations: each EPA will be optimized for its own purpose / assumptions / QoS indicators (average / worst case latency, input / output throughput…)
• Global optimizations: scheduling, load balancing, assignment…
[Figure: an EPN of producers, EPAs and consumers, with local optimizations marked on individual EPAs and global optimizations marked on the whole network]

Emerging direction III: event processing software engineering
• Modeling and meta-modeling
• Methodologies
• Design patterns
• Best practices

Emerging direction IV: intelligent event processing
• Offline and continuous mining of meaningful patterns in event histories
• Inexact event processing: handling inexact events and also false positives and false negatives
• Causality: a key for proactive processing, but also vital for provenance

Summary
• Event processing has emerged from several academic disciplines
• It has already attracted coverage from analysts and all major software vendors
• The state of the practice is the first generation of products, mainly engineering based
• Going to the next phase poses many challenges that require collaboration between research and industry