Distributed Systems: Introduction, Motivation & Overview

MIDDLEWARE SYSTEMS
RESEARCH GROUP
MSRG.ORG
Big Events
Hans-Arno Jacobsen
Middleware Systems Research Group
MSRG.org
Big Event Data
Traditional Big Data Domain vs.
Rest of Universe
• There are other emerging domains with needs similar to Big Data
– Smart grids
– Smart cities …
My first message: There are other relevant Big Data domains. Beware!
Smart Grids for Taming
The Energy Problem
Relevance of Smart Grids
• Increasing penetration of variable renewable energy sources such as wind and solar
• Paradigm shift from demand-following supply to supply-following demand
• Need for new large-scale information system infrastructure to control demand
Distributed Generation, Flexible Loads
and Energy Storage
• Come in big numbers
• Show unique behavior (users, weather, equipment, …)
• Have to be monitored and controlled
→ Big event data challenge
Solar Photovoltaic Power Generation
[Figure: three time-series panels, x-axis in minutes: air temperature (degrees Celsius), direct normal solar irradiance (watts/m^2), diffuse solar irradiance (watts/m^2)]
~2.3 TB per year and per 1k panels
• High frequency measurements required
• Several metrics of interest, many spatially distributed measurement points
Source: National Oceanic & Atmospheric Administration (U.S.)
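The slide quotes a yearly data volume but not the sampling rate or record size behind it. A minimal sketch of how such a back-of-envelope estimate can be set up follows. The function name `yearly_bytes` and all parameter values (1 sample per second, 3 metrics, 8 bytes per sample) are assumptions chosen for illustration, and the result is not meant to reproduce the figure on the slide.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # ignores leap years


def yearly_bytes(sensors: int, metrics: int,
                 samples_per_second: float, bytes_per_sample: int) -> float:
    """Raw, uncompressed bytes produced per year by a fleet of sensors.

    All arguments are hypothetical inputs; the slides do not state them.
    """
    return (sensors * metrics * samples_per_second
            * bytes_per_sample * SECONDS_PER_YEAR)


if __name__ == "__main__":
    # Illustrative assumption: 1k panels, 3 metrics each, 1 Hz, 8-byte values.
    total = yearly_bytes(sensors=1000, metrics=3,
                         samples_per_second=1.0, bytes_per_sample=8)
    print(f"{total / 1e12:.2f} TB per year")
```

The estimate scales linearly in every factor, so raising the sampling rate or the number of measurement points by an order of magnitude raises the volume by the same factor.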
Use of PEVs as Grid Resource
[Figure: GPS coordinates of six trips (Trip 1 to Trip 6), plotted as delta latitude against delta longitude]
~0.5 TB per year and per 1k vehicles
• High frequency measurements required
• Important for smart grid (SG) applications: continuous update of trip destination and energy level at destination
Source: Auto21 Project, University of Winnipeg
Electric Power Consumption
[Figure: three time-series panels plotted over time: real power (kW), reactive power (kVAR), voltage (Volts)]
~27.5 PB per year and per 1k homes
Source: UCI Machine Learning Repository
• Very high frequency measurements required (e.g., for inferring device on/off events, grid stability, etc.)
• Several metrics of interest (household electricity meters, single devices, etc.)
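To make "inferring device on/off events" concrete, here is a minimal sketch that flags step changes in a series of real-power readings. It is one simple approach (thresholding the difference between consecutive samples), not a method taken from the slides. The function name `detect_switch_events`, the 0.5 kW threshold, and the sample readings are all assumptions.

```python
from typing import Iterable, List, Tuple


def detect_switch_events(power_kw: Iterable[float],
                         threshold_kw: float = 0.5) -> List[Tuple[int, str, float]]:
    """Return (index, kind, delta) for each jump of at least threshold_kw
    between consecutive samples; kind is "on" for a rise, "off" for a drop."""
    events: List[Tuple[int, str, float]] = []
    previous = None
    for i, value in enumerate(power_kw):
        if previous is not None:
            delta = value - previous
            if delta >= threshold_kw:
                events.append((i, "on", delta))
            elif delta <= -threshold_kw:
                events.append((i, "off", delta))
        previous = value
    return events


# Hypothetical household readings in kW (not taken from the data set).
readings = [0.25, 0.25, 2.25, 2.25, 2.5, 0.5, 0.25]
events = detect_switch_events(readings)
print(events)
```

A detector like this only sees a switch if the sampling interval is short compared with how quickly devices turn on and off, which is one reason the slide asks for very high frequency measurements.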
Traditional Big Data Domain vs.
Rest of Universe
My second message: Detecting events in real time in the sea of Big Data is just as important.
Towards Big Events
• Many non-traditional scenarios that require filtering of Big Events at large scales
• … scenarios that require filtering & storage of events at large scales
• Filtering & storage of “event streams”
• Filtering & storage of “event showers”
Event Showers vs. Event Streams
Event Showers
• Partially ordered sets of events
• No single event schema
• Events vary in shape and size from one to the next
• Processing of many event expressions
• Tends to require support for aggregation
• Broader model & paradigm (dissemination, matching, coordination)
Event Stream Processing
• Linearly ordered event sequences
• Schema-based, single schema per stream
• Stream tuples follow schema
• More single-expression processing-based
• Aggregation is a key requirement
• Focused on processing queries/expressions over event streams
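The contrast between the two models can be sketched in a few lines. The first part matches schema-less events against many registered expressions, as in an event shower; the second aggregates over a single-schema, linearly ordered stream. This is an illustrative sketch only: the `Matcher` class, the `sliding_mean` helper, and every attribute name (`voltage`, `charge`, `kind`, and so on) are hypothetical and do not come from the slides.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]                # schema-less: any set of attributes
Predicate = Callable[[Event], bool]   # one event expression


class Matcher:
    """Event-shower style: many expressions, events of varying shape."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, Predicate] = {}

    def subscribe(self, name: str, predicate: Predicate) -> None:
        self._subscriptions[name] = predicate

    def match(self, event: Event) -> List[str]:
        """Names of all expressions the event satisfies, sorted."""
        return sorted(name for name, pred in self._subscriptions.items()
                      if pred(event))


def sliding_mean(values: List[float], window: int) -> List[float]:
    """Stream style: mean over the last `window` tuples of one ordered stream."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


matcher = Matcher()
matcher.subscribe("overvoltage", lambda e: e.get("voltage", 0) > 250)
matcher.subscribe("low_battery",
                  lambda e: e.get("kind") == "pev" and e.get("charge", 1.0) < 0.2)

# Hypothetical events with different attribute sets (no single schema).
shower = [
    {"kind": "meter", "voltage": 253, "home": 17},
    {"kind": "pev", "charge": 0.1, "lat": 49.9},
    {"kind": "panel", "irradiance": 810},
]
matches = [matcher.match(e) for e in shower]
print(matches)
print(sliding_mean([1.0, 2.0, 3.0, 4.0], window=2))
```

The matcher tolerates missing attributes through `dict.get`, which is what lets one set of expressions run over events of different shapes. The stream aggregate instead relies on every tuple having the same form and arriving in order.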
Conclusions
• Big Events are Big Data in motion
• Processing Big Data in real time to detect events of interest is important as well
• There are other emerging application domains; let us watch out for them
My final message: Big Data Benchmarking efforts should take this into account.
Acknowledgements
• C. Goebel for help with smart grid slides