Overview of Microsoft StreamInsight
Torsten Grabs, Lead Program Manager, Microsoft StreamInsight

The Need for an Event-Driven Platform
Analytical results need to reflect important changes in business reality immediately and enable responses to them with minimal latency.

                  Database Applications               Event-driven Applications
Query Paradigm    Ad-hoc queries or requests          Continuous standing queries
Latency           Seconds, hours, days                Milliseconds or less
Data Rate         Hundreds of events/sec              Tens of thousands of events/sec or more
Query Semantics   Declarative relational analytics    Declarative relational and temporal analytics

[Diagram: a database application follows a request/response pattern; an event-driven application turns an input stream of events into an output stream.]

Scenarios for Event-Driven Applications
[Figure: application classes plotted by latency (months, days, hours, minutes, seconds, 100 ms, < 1 ms) against aggregate data rate (0 to ~1 million events/sec). Classes shown: relational database applications; operational analytics applications (e.g., logistics); data warehousing applications; web analytics applications; monitoring applications; manufacturing applications; financial trading applications. One region is labeled "CEP Target Scenarios".]

Example Scenarios
Manufacturing:
• Sensor on plant floor
• React through device controllers
• Aggregated data
• 10,000 events/sec
Web Analytics:
• Click-stream data
• Online customer behavior
• Page layout
• 100,000 events/sec
Financial Services:
• Stock & news feeds
• Algorithmic trading
• Patterns over time
• Super-low latency
• 100,000 events/sec
Power, Utilities:
• Energy consumption
• Outages
• Smart grids
• 100,000 events/sec
Applications built on these streams:
• Visual trend-line and KPI monitoring
• Batch & product management
• Automated anomaly detection
• Real-time customer segmentation
• Algorithmic trading
• Proactive condition-based maintenance
[Diagram: asset instrumentation for data acquisition and subscriptions to data feeds produce data streams for the Event Processing Engine, which looks up asset specs & parameters and feeds a stream data store & archive. The engine runs:]
• Threshold queries
• Event correlation from multiple sources
• Pattern queries

StreamInsight Platform
[Diagram, spanning "StreamInsight Application Development" and "StreamInsight Application at Runtime":]
• Event sources: devices, sensors; web servers; event stores & databases; stock ticker, news feeds
• Input Adapters
• StreamInsight Engine: standing queries (query logic)
• Output Adapters
• Event targets: pagers & monitoring devices; KPI dashboards, SharePoint UI; trading stations; event stores & databases

What is Project “Austin”?
• Real time data collection from wide variety of connected devices (Sensors, Smart Meters, Servers, Tablets, Phones)
• Standards compliant endpoints (REST, XML, JSON)
• Securable data ingress with data enrichment and transformation (geotagging, etc.)
• Multi-tenant Azure service with flexible, elastic capacity for collection and analytics
• Federated scale out collection and analytics
• Distributed service monitoring and tracing
• Turn key connectivity for platform data sources and sinks (SQL Azure, Windows Azure Table Storage)
• Integrated with Azure management portal and billing experiences
• Rich temporal (StreamInsight) and sequential (Reactive Framework) analytics models
• Dynamic, flexible query and data source management experience

StreamInsight on Azure: “Austin”
[Diagram, spanning "StreamInsight Application Development" and "StreamInsight Application at Runtime":]
• Prebuilt Input Adapters; Scalable Data Ingress Adapter
• Austin StreamInsight Engine: standing queries (StreamInsight Query, Reactive Query, StreamInsight Query)
• Prebuilt Output Adapters; Data Egress Adapters
• Services: Authentication, Built-in Archive, Management Service, Monitoring Service

Events
• Events expose different temporal characteristics
  - Point in time events
  - Interval events with fixed duration
  - Interval events with initially unknown duration
• Rich payloads capture all properties of an event
[Diagram: events a, b, c, d, e plotted by payload/value against time (t1 to t5).]

Event Types
• Events in Microsoft’s CEP platform use the .NET type system
• Events are structured and can have multiple fields
  - Fields are typed using the .NET framework types
  - CEP engine provisioned timestamp fields capture all the different temporal event characteristics
  - Event sources populate time stamp fields
[Example event layout: Timestamps/Metadata | Long pumpID | String Type | String Location | Double flow | Double pressure]

Event Streams & Adapters
• A stream is a possibly infinite sequence of events
  - Insertions of new events
  - Changes to event durations
• Stream characteristics:
  - Event/data arrival patterns: steady rate with end-of-stream indication; intermittent, random, or in bursts
  - Out of order events: order of arrival of events does not match the order of their application timestamps
• Adapters
  - Receive/get events from the data source
  - Enqueue events for
    processing in the engine

Typical CEP Queries
Typical CEP queries require a combination of functionality:
• Complex type describes event properties
• Calculations introduce additional event properties
• Grouping by one or more event properties
• Aggregation for each event group over a pre-defined period of time, typically a window
• Multiple event groups monitored by the same query
• Correlate event streams
• Check for absence of activity with a data source
• Enrich events with reference data
• Collection of assets may change over time
We want to make writing and maintaining those queries easy or even effortless.

StreamInsight Query Features
Operators over streams:
• Calculations (PROJECT)
• Correlation of streams from different data sources (JOIN)
• Check for absence of activity with a data source (EXISTS)
• Selection of events from streams (FILTER)
• Stream partitioning (GROUP & APPLY)
• Aggregation (SUM, COUNT, …)
• Ranking and heavy hitters (TOP-K)
• Temporal operations: hopping window, sliding window
• Extensibility: add new domain-specific operators

LINQ Query Examples
LINQ Example: JOIN, PROJECT, FILTER

    from e1 in MyStream1
    join e2 in MyStream2 on e1.ID equals e2.ID    // Join
    where e1.f2 == "foo"                          // Filter
    select new { e1.f1, e2.f4 };                  // Project

LINQ Example: GROUP & APPLY, WINDOW

    from e3 in MyStream3
    group e3 by e3.i into SubStream                                   // Grouping
    from win in SubStream.HoppingWindow(FiveMinutes, ThreeSeconds)    // Window
    select new { i = SubStream.Key, a = win.Avg(e => e.f) };          // Project & Aggregate

Extensibility SDK
• Built-in operators do not cover all functionality
  - Need for domain-specific extensions
  - Integrate with functionality from existing libraries
• Support for extensions in the CEP platform:
  - User-defined operators, functions, aggregates
  - Code written in .NET, deployed as .NET assembly
  - Query operators and LINQ can refer to functionality of the assembly
• Temporal snapshot operator framework
  - Interface to implement user-defined operators
  - Manages operator state and snapshot changes
  - Framework does the heavy lifting
    to deal with intricate temporal behavior such as out-of-order events

Resiliency
• Outages happen in computing
  - Power outages
  - “Patch Tuesday”
  - Human mistakes
  - Planned and unplanned downtime
• Systems need to be “resilient” to outages
  - Minimize damage
  - Become operational again quickly
• The specific requirements depend on how mission critical your application is

Resiliency: Timeliness
• Timeliness: recover from outages quickly.
• Goal is simple: as fast as possible.
• StreamInsight doesn’t store event data, but it does store query state.
  - This may be significant.
  - This may be slow to recreate.

Resiliency: Correctness

What is Checkpointing?
• Checkpointing saves a query’s state to disk.
  - You control when the checkpoint is initiated.
  - StreamInsight takes care of saving out consistent state.
• After an outage, StreamInsight can restore this state.
  - This limits state loss during an outage, speeding recovery.
  - Level of correctness depends on additional work we are able to perform.
  - Recovery process is coordinated by StreamInsight.

Checkpointing API

    public IAsyncResult server.BeginCheckpoint(
        Query query, AsyncCallback asyncCallback, object asyncState);
    public bool server.EndCheckpoint(IAsyncResult asyncResult);
    public void server.CancelCheckpoint(IAsyncResult asyncResult);

When is Checkpointing Useful?
• Provides a mechanism to recover from an outage:
  - To recover from unexpected system failure.
  - To handle expected outages (e.g., Patch Tuesday).
  - For machine migration.
• Not a panacea:
  - Does not provide uninterrupted service.
  - Does not protect against broken query logic.

Using Checkpoints
We’ll walk through the three progressively strict checkpointing scenarios:
1. State retention.
2. Equivalent events.
3. Exact equivalence.

Low Bar: State Retention
    Ideal output:  A B C D E F G H …
    Real output:   A B       F’ G’ H’ …
Checkpointing: enqueue markers into input streams to instruct operators to save their state.
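
The state-retention scenario can be illustrated with a toy model. The sketch below is not StreamInsight code: `RunningSum`, `CheckpointDemo`, and every parameter are hypothetical names invented for illustration, and the "checkpoint" is an in-memory snapshot rather than state written to disk. It only shows the principle that restoring saved state limits the loss, while work done after the checkpoint and events arriving during the outage are still missing from the output.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a stateful query operator: a running sum over event values.
// Hypothetical type for illustration; not part of the StreamInsight API.
public sealed class RunningSum
{
    public long Sum { get; private set; }
    public int Count { get; private set; }

    // Process one event and emit the updated sum.
    public long Process(int value)
    {
        Sum += value;
        Count++;
        return Sum;
    }

    // "Checkpoint": capture a consistent copy of the operator state.
    public (long Sum, int Count) Save() => (Sum, Count);

    // "Recovery": rebuild the operator from a saved snapshot.
    public static RunningSum Restore((long Sum, int Count) snapshot) =>
        new RunningSum { Sum = snapshot.Sum, Count = snapshot.Count };
}

public static class CheckpointDemo
{
    // Processes input[0..crashAfter), taking a checkpoint after
    // `checkpointAfter` events, then simulates an outage and resumes
    // at index `resumeAt` from the restored state.
    public static List<long> Run(int[] input, int checkpointAfter, int crashAfter, int resumeAt)
    {
        var output = new List<long>();
        var op = new RunningSum();
        (long Sum, int Count) snapshot = op.Save();

        for (int i = 0; i < crashAfter; i++)
        {
            output.Add(op.Process(input[i]));
            if (i + 1 == checkpointAfter) snapshot = op.Save();
        }

        // Outage: in-memory state is lost. Events in [crashAfter, resumeAt)
        // arrive while the system is down and are never processed.
        op = RunningSum.Restore(snapshot);

        for (int i = resumeAt; i < input.Length; i++)
            output.Add(op.Process(input[i]));

        return output;
    }

    public static void Main()
    {
        int[] input = { 1, 2, 3, 4, 5, 6, 7, 8 };
        var output = Run(input, checkpointAfter: 2, crashAfter: 4, resumeAt: 6);
        Console.WriteLine(string.Join(" ", output)); // prints "1 3 6 10 10 18"
    }
}
```

For this input an uninterrupted run would produce 1 3 6 10 15 21 28 36. The recovered run resumes from the checkpointed sum of 3, so its later results differ from the ideal ones, which is the behavior the low bar accepts.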
[Diagram: the input stream c d e f g h i j … passes through the operators; an outage ("oops") occurs after the checkpoint.]
Recovery: load saved operator state and then start consuming input.
[Diagram: after recovery, input resumes at g h i j k l m n …]

Medium Bar: Equivalent Events
    Ideal output:  A B C D E F G H …
    Real output:   A B   B C D …

Filling the Gaps
• StreamInsight needs help:
  - Missing state since last checkpoint.
  - Missed events during outage.
• Solution: replayable adapters.
• The dance:
  1. StreamInsight picks a place in the input stream.
  2. StreamInsight communicates this to the input adapter.
  3. The input adapter replays from the chosen spot.
[Diagram, checkpointing: the input stream c d e f g h i j k l … with a checkpoint in progress.]
[Diagram, recovery: the input adapter replays e f g h i j k l …]

A Place in the Stream
[Figure: events a to h of the physical stream plotted against application time (0 to 8), with the high-water mark indicated.]

Communicating the State
• Input adapter factories can optionally implement one of
  - IHighWaterMarkInputAdapterFactory
  - IHighWaterMarkTypedInputAdapterFactory
• In a recovery situation, StreamInsight will then call Create with a high-water mark.
• The factory is then responsible for properly cueing the input.

StreamInsight in Action
Internet of Things Demo

The Demo
StreamInsight “Austin”

StreamInsight Design Principles
• Scalability: aggregate data rate keeps increasing.
  - Minimum resources impact (co-located).
  - Local computation
  - Avoid flooding the network
• Programmability
  - Extensibility: UserDefinedAggregates, UserDefinedFunctions, UserDefinedOperators.
  - Composability.
  - Developer experience (language, IDE, debugging, supportability)
• Adaptability
  - Easy to integrate via adapters.
  - Portability (servers, edge devices)

StreamInsight Architecture
[Diagram: a Host Process contains a Web Service and the Engine. Engine components: Management Service, Command Dispatcher, Runtime, Adapters, Compiler, Expression / Type Service, Stream OS, Execution Operators, Stream Manager, Plan Manager, Query Scheduler, Synopsis, Event Manager, Metadata, Diagnostics / Tracing.]

Management Service
Highlights
• Manageability API for query management (i.e., create, start, stop, delete query) and supportability / monitoring of running queries
• Same manageability API for both embedded deployment and web service clients

Compiler & Expressions
Highlights
• Standardized IL allows us to implement a variety of syntactic surfaces over the algebra, e.g., LINQ, CQL, etc.
  - Allows for domain-specific front-end languages
  - Prepared for future extensions
• Compile-time type checking and type-safe code generation for minimal runtime impact.
• Support for UDFs, UDAggs, UDOs.
• JIT code generation for field references and expression evaluation, for low latency processing of high event rates.
• Building on the CLR lets us leverage:
  - Code generator, JIT support
  - Type system
  - Tools and libraries (LINQ Expressions, IDE, etc.)

Events & Streams
Highlights
• JIT code generation for field references and expression evaluation, because interpreting these references is sub-optimal for low latency processing of high event rates.
• Leverage JIT code generation support in CLR runtime for LINQ expressions.
• Bind the query to different deployment environments based on the metadata.
• Event manager is implemented as a combination of managed and native code in order to minimize overhead and ensure predictable performance.
• Events are read-only and reference-counted by streams (minimize data copying)

Query Scheduler
Highlights
• A query is executed by scheduling the individual operators as they become active.
• Operator state transition is managed by the Scheduler.
• When an operator becomes active, a thread is scheduled for execution.
• Scheduling decision based on priority of the query and other parameters.
• Data flow architecture: reduced coupling and pipeline parallelism
• Operators are affinitized to a thread/core (multi-core environments) to decrease lock contention and increase caching benefits. Periodic checks and migration for load balancing

Execution Operators
[Diagram: GroupAndApply data flow. Input ABC enters Group, which splits it into groups A, B, C; an Apply branch runs per group (e.g., BBB in, YYY out); Union merges X, Y, Z into the output XYZ.]
Highlights
• Efficient implementation of operators that perform incremental evaluation as each event is processed.
• Clean, formal semantics. Leverage relational semantics whenever possible.
• GroupAndApply Operator
  - Enables parallelism for scale-up (multi-core).
  - Groups are dynamically instantiated and torn down based upon the data.
    Large numbers of groups can be simultaneously active (~50M active groups for MSN.com).

The StreamInsight Team
• Founded in 2008 based on incubation between MSR and SQL teams
• Small team, by Microsoft standards
• Roles in Microsoft engineering teams
  - Program Managers: customer scenarios, functional specs, APIs, project mgmt, evangelism
  - Developers: architecture, technical design, product code, unit tests
  - Testers: test breakout, test code, lab runs, release signoff
• Using agile development methods
  - Using Scrum to organize and manage schedules
  - Work organized in sprints/milestones
  - CTP (Community Technology Preview) after each milestone, similar to public beta
  - TAP (Technology Adopter Program) as we get closer to the planned release

StreamInsight Roadmap
StreamInsight 2.1 (on prem)
• Development experience
• Major API overhaul
StreamInsight on Azure (Cloud)
• StreamInsight service on Windows Azure
• Currently private CTP
• GA this summer

For More Information
• StreamInsight download location: http://go.microsoft.com/fwlink/?LinkId=160598
• StreamInsight blog: http://blogs.msdn.com/streaminsight/
• StreamInsight MSDN documentation: http://msdn.microsoft.com/en-us/library/ee362541(SQL.105).aspx
• StreamInsight MSDN portal: http://msdn.microsoft.com/en-us/ee476990.aspx