DBI303: Microsoft StreamInsight: Introduction to Complex Event

SELECT COUNT(*) FROM ParkingLot
WHERE type = 'AUTO'
AND color = 'RED'
red cars
last hour
Doesn’t seem like a great solution…
This is the streaming data paradigm in a nutshell –
ask questions about data in flight.
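As a rough illustration of asking the "red cars in the last hour" question while the data is in flight, here is a minimal plain-Python sketch. It is not StreamInsight code; the class name, the event shape (timestamp in seconds, vehicle type, color) and the one-hour window are assumptions made for the example.

```python
from collections import deque


class RedCarCounter:
    """Hypothetical standing query: how many red autos were seen in the last hour?"""

    def __init__(self, window_seconds=3600):
        self.window_seconds = window_seconds
        self._timestamps = deque()  # arrival times of matching events, oldest first

    def on_event(self, timestamp, vehicle_type, color):
        """Consume one event (timestamps must be non-decreasing); return the current count."""
        if vehicle_type == "AUTO" and color == "RED":
            self._timestamps.append(timestamp)
        # Expire matches that have fallen out of the window (timestamp - window, timestamp].
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


counter = RedCarCounter()
events = [
    (0, "AUTO", "RED"),
    (1800, "AUTO", "BLUE"),
    (3000, "AUTO", "RED"),
    (4000, "TRUCK", "RED"),
    (7000, "AUTO", "RED"),
]
counts = [counter.on_event(*e) for e in events]
print(counts)  # [1, 1, 2, 1, 1]
```

The answer is updated incrementally on every arriving event, instead of re-running a query over a stored table.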
[Figure: $ value of analytics plotted against the time of interest, running from years, months and days down to hrs, min, sec and the present. Labels: Historical Trend Analysis; Forecasting in Enterprises; Web Analytics – Ad placement, Financial Services, Smart Grids, Monitoring – Systems mgmt, Health Care, Manufacturing, etc.]
[Figure: end-to-end analytics architecture with the stages Sources, Caching, Processing, Distribution and Visualization. Sources: Web servers; Devices, Sensors; Stock tickers & News feeds. Components: Data Bus, Message Bus, Service Broker, Reference Data, Operational Analytics Cache, Microsoft StreamInsight, Relational Database, ETL, Intra-Day Cubes, Historic Cubes. Outputs: Automated Decisions; Operational Dashboard (Ticking - Snapshot); Reporting Dashboard (Refreshed); Static Reports; Mining, Validation, “What-If” Scenarios. Refresh modes shown: Refresh (Push), Re-compute (Pull).]
Analytical results need to reflect important changes in business reality immediately and enable
responses to them with minimal latency
Criterion          Database Applications               Event-driven Applications
Query Paradigm     Ad-hoc queries or requests          Continuous standing queries
Latency            Seconds, hours, days                Milliseconds or less
Data Rate          Hundreds of events/sec              Tens of thousands of events/sec or more
Query Semantics    Declarative relational analytics    Declarative relational and temporal analytics
request
response
Event
input
stream
output
stream
[Chart: StreamInsight Target Scenarios. Vertical axis: Latency (Months, Days, Hours, Minutes, Seconds, 100 ms, < 1 ms). Horizontal axis: Aggregate Data Rate (Events/sec.) (0, 10, 100, 1000, 10000, 100000, ~1 million). Application classes plotted: Relational Database Applications; Data Warehousing Applications; Operational Analytics Applications, e.g., Logistics, etc.; Web Analytics Applications; Monitoring Applications; Manufacturing Applications; Financial trading Applications.]
StreamInsight Application Development
StreamInsight Application at Runtime
[Diagram: Event sources (Devices, Sensors; Web servers; Stock ticker, news feeds; Event stores & Databases) → Input Adapters → StreamInsight Engine running Standing Queries (Query Logic) → Output Adapters → Event targets (Pagers & Monitoring devices; KPI Dashboards, SharePoint UI; Trading stations; Event stores & Databases)]
Industry trends
• Data acquisition costs are negligible
• Raw storage costs are small and continue to decrease
• Processing costs are non-negligible
• Data loading costs continue to be significant
CEP advantage
• Process data incrementally, i.e., while it is in flight
• Avoid loading while still doing the processing you want
• Seamless querying for monitoring, managing and mining
Monitor KPIs
Record raw data (history)
Manage business via KPI-triggered actions
Mine historical data
Devise new KPIs
Manufacturing:
• Sensor on plant floor
• React through device controllers
• Aggregated data
• 10,000 events/sec
Web Analytics:
• Click-stream data
• Online customer behavior
• Page layout
• 100,000 events/sec
Financial Services:
• Stock & news feeds
• Algorithmic trading
• Patterns over time
• Super-low latency
• 100,000 events/sec
Power Utilities:
• Energy consumption
• Outages
• Smart grids
• 100,000 events/sec
Visual trend-line and KPI monitoring
Batch & product management
Automated anomaly detection
Real-time customer segmentation
Algorithmic trading
Proactive condition-based maintenance
Asset Specs &
Parameters
Stream Data Store
& Archive
Data Stream
Data Stream
Asset Instrumentation for Data Acquisition, Subscriptions to Data Feeds
Event Processing Engine
Lookup
• Threshold queries
• Event correlation from multiple sources
• Pattern queries
Push
StreamInsight
Grouping
Aggregation
Output Adapters
Input Adapters
Market
Feed:
-MSFT
-IBM
-etc.
Push
Push
Asset Class   Ticker   Exchange   SUM Volume   SUM Bid   SUM Ask
Stock         MSFT     NASDAQ     100          100       100
Stock         IBM      NASDAQ     200          200       200
Pull
Pull
Temporal
LINQ Example – JOIN, PROJECT, FILTER:
from e1 in MyStream1
join e2 in MyStream2
    on e1.ID equals e2.ID
where e1.f2 == "foo"
select new { e1.f1, e2.f4 };
Join
Filter
Project
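To make the join, filter and project steps concrete, the following plain-Python sketch mimics the query on two small in-memory event lists. It is not the StreamInsight API: the `Event` class, the interval fields and the sample payloads are assumptions for illustration, and the overlap test stands in for the temporal matching a CEP engine performs on event lifetimes.

```python
from dataclasses import dataclass


@dataclass
class Event:
    start: int     # inclusive start of the event's validity interval
    end: int       # exclusive end of the validity interval
    payload: dict  # field names (ID, f1, f2, f4) follow the LINQ example


def join_filter_project(stream1, stream2):
    """Nested-loop sketch of: join on ID, filter on f2 == "foo", project f1 and f4."""
    results = []
    for e1 in stream1:
        if e1.payload["f2"] != "foo":  # FILTER
            continue
        for e2 in stream2:
            same_id = e1.payload["ID"] == e2.payload["ID"]      # JOIN condition
            overlap = e1.start < e2.end and e2.start < e1.end   # lifetimes intersect
            if same_id and overlap:
                # PROJECT: keep only the two requested fields
                results.append({"f1": e1.payload["f1"], "f4": e2.payload["f4"]})
    return results


stream1 = [
    Event(0, 10, {"ID": 1, "f1": "a", "f2": "foo"}),
    Event(0, 10, {"ID": 2, "f1": "b", "f2": "bar"}),   # dropped by the filter
    Event(20, 30, {"ID": 3, "f1": "c", "f2": "foo"}),  # no overlapping partner
]
stream2 = [
    Event(5, 15, {"ID": 1, "f4": 100}),
    Event(5, 15, {"ID": 2, "f4": 200}),
    Event(40, 50, {"ID": 3, "f4": 300}),
]
joined = join_filter_project(stream1, stream2)
print(joined)
```

A real engine would evaluate this incrementally as events arrive; the nested loop is only meant to show which pairs qualify.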
LINQ Example – GROUP&APPLY, WINDOW:
from e3 in MyStream3
group e3 by e3.i into SubStream
from win in SubStream.HoppingWindow(FiveMinutes, ThreeSeconds)
select new { i = SubStream.Key,
             a = win.Avg(e => e.f) };
Grouping
Window
Project &
Aggregate
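The group-and-window query can be approximated in plain Python as follows. This is a sketch, not StreamInsight code: it assumes point events given as (timestamp, key, value) tuples with integer timestamps, windows aligned to multiples of the hop size, and a hypothetical function name. The example uses a window of 4 and a hop of 2 instead of five minutes and three seconds to keep the numbers readable.

```python
from collections import defaultdict


def hopping_window_avg(events, window_size, hop_size):
    """Average of value per (key, window_start) over hopping windows.

    A window covers [window_start, window_start + window_size) and a new
    window starts every hop_size time units, so windows overlap when
    hop_size < window_size.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (key, window_start) -> [total, count]
    for ts, key, value in events:
        # Latest window start at or before ts, aligned to the hop size.
        start = (ts // hop_size) * hop_size
        # Walk back over every window that still contains ts.
        while start > ts - window_size:
            acc = sums[(key, start)]
            acc[0] += value
            acc[1] += 1
            start -= hop_size
    return {k: total / count for k, (total, count) in sums.items()}


events = [(1, "x", 10.0), (3, "x", 20.0), (3, "y", 5.0)]
averages = hopping_window_avg(events, window_size=4, hop_size=2)
print(averages[("x", 0)])  # 15.0, the window [0, 4) holds both "x" events
```

Each event contributes to window_size / hop_size windows, which is why a five-minute window hopping every three seconds produces a fresh average every three seconds.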
[Diagram: Data Sources (Web servers, Sensors, Devices, Feeds) feed multiple StreamInsight instances arranged in tiers, labeled Aggregation & Correlation and Complex Analytics & Mining.]
Event processing engines are deployed at multiple places on different scales:
• At the edge: close to the data source
• In the mid-tier: consolidate related data sources
• In the data center: historical archive, mining, large scale correlation
StreamInsight
CEP for lightweight processing and filtering
StreamInsight
CEP for aggregation and correlation
StreamInsight
CEP for complex analytics including historical data
Parallel Data
Warehouse
Workload
Standard
Enterprise
Datacenter
Custom/Packaged OLTP Apps
4 procs,
64GB RAM,
Backup Compression
8 procs,
2TB RAM,
Adv. Security,
Backup Compression
>8 procs,
OS Max,
Adv. Security,
Backup Compression
N/A
1 VM/license
4 VMs/license,
Resource Governor
App & Multi-Server Mgmt
(up to 25 instances)
Unlimited Virtualization, Resource
Governor, App & Multi-Server Mgmt
(> 25 instances)
N/A
Scale-Up DW,
Data Compression
Scale-Up DW,
Data Compression
Scale-Out DW
10s of TBs, Up to 30 TB
with FastTrack
10s of TBs
10s - 100s of TBs
Server Consolidation
Data Warehousing
Business Intelligence
Dept/Team BI
Enterprise-Scale BI,
Master Data Services, PowerPivot
Mgmt
Enterprise-Scale BI, Master Data
Services, PowerPivot Mgmt
Complex Event Processing
(StreamInsight)
<5000 events/sec &
> 5 sec latency
<5000 events/sec &
> 5 s latency
>5000 events/sec &
< 5 s latency
Integrated with SSIS,
SSAS and SSRS
Future coverage
Scenarios:
Manufacturing
Utilities
Oil & Gas
Financial
Services
Web Analytics
Telco
Alarming
AMI/SmartGrid
Well
Monitoring
Risk
Management
Behavioral
Targeting
CDR
Aggregation
Notifications
Outage
Management
Operational
Intelligence
Market
Monitoring
Load Monitoring
Real-Time Analysis
ISV:
OSIsoft
Matrikon
ICONICS
OSIsoft
Matrikon
Telvent
ICONICS
OSIsoft
Matrikon
Lab49
IMGroup
MSFT AdCenter
XBox
DPE
SI:
Logica
Logica
Logica
Hitachi
Consulting
Lab49
IMGroup
MSFT AdCenter
XBox
DPE
StreamInsight Application Development
Development experience with .NET, C#, LINQ and Visual Studio 2008 and 2010
Support for .NET sequences as sources and sinks; flexible adapter SDK to connect to other event sources and sinks
StreamInsight Application at Runtime
CEP platform from Microsoft to build event-driven applications
Event-driven applications are fundamentally different from traditional database applications: queries are continuous, consume and produce streams, and compute results incrementally
The CEP platform does the heavy lifting for you to deal with temporal characteristics of event stream data
[Diagram: Event sources (Devices, Sensors; Web servers; Event stores & Databases; Stock ticker, news feeds) → Input Adapters → StreamInsight Engine running Standing Queries (Query Logic) → Output Adapters → Event targets (Pagers & Monitoring devices; KPI Dashboards, SharePoint UI; Trading stations; Event stores & Databases)]
product main page
blog
Hitchhiker’s guide
Blog post with download location
MSDN documentation
samples
http://northamerica.msteched.com
www.microsoft.com/teched
www.microsoft.com/learning
http://microsoft.com/technet
http://microsoft.com/msdn
Download