MIDDLEWARE SYSTEMS RESEARCH GROUP MSRG.ORG

Big Events
Hans-Arno Jacobsen
Middleware Systems Research Group
MSRG.org

Big Event Data

Traditional Big Data Domain vs. Rest of Universe
• There are other emerging domains with needs similar to Big Data
– Smart grids
– Smart cities
– …
My first message: There are other relevant Big Data domains – beware!

Smart Grids for Taming The Energy Problem

Relevance of Smart Grids
• Increasing penetration of variable renewable energy sources such as wind and solar
• Paradigm shift from demand-following supply to supply-following demand
• Need for new large-scale information system infrastructure to control demand

Distributed Generation, Flexible Loads and Energy Storage
• Come in big numbers
• Show unique behavior (users, weather, equipment, …)
• Have to be monitored and controlled
Big event data challenge

Solar Photovoltaic Power Generation
[Figure: three time-series panels, x-axis in minutes: air temperature (degrees Celsius), direct normal solar irradiance (watts/m^2), diffuse solar irradiance (watts/m^2)]
~2.3 TB per year per 1k panels
• High frequency measurements required
• Several metrics of interest, many spatially distributed measurement points
Source: National Oceanic & Atmospheric Administration (U.S.)

Use of PEVs as Grid Resource
[Figure: GPS coordinates of Trips 1 to 6, delta latitude vs. delta longitude]
~0.5 TB per year per 1k vehicles
• High frequency measurements required
• Important for SG applications: Continuous update of trip destination and energy level at destination
Source: Auto21 Project, University of Winnipeg

Electric Power Consumption
[Figure: three time-series panels over time: real power (kW), reactive power (kVAR), voltage (volts)]
~27.5 PB per year per 1k homes
• Very high frequency measurements required (e.g., for inferring device on/off events, grid stability, etc.)
• Several metrics of interest (household electricity meters, single devices, etc.)
Source: UCI Machine Learning Repository

Traditional Big Data Domain vs. Rest of Universe
My second message: Detecting events in real time in the sea of Big Data is just as important.

Towards Big Events
• Many non-traditional scenarios that require filtering of Big Events at large scales
• … scenarios that require filtering & storage of events at large scales
• Filtering & storage of “event streams”
• Filtering & storage of “event showers”

Event Showers vs. Event Streams

Event Showers
• Partially ordered sets of events
• No single event schema
• Events vary in shape and size from one to the next
• Processing of many event expressions
• Tends to require support for aggregation
• Broader model & paradigm (dissemination, matching, coordination)

Event Stream Processing
• Linearly ordered event sequences
• Schema-based, single schema per stream
• Stream tuples follow schema
• More single-expression processing-based
• Aggregation is a key requirement
• Focused on processing queries/expressions over event streams

Conclusions
• Big Events are Big Data in motion
• Processing Big Data in real time to detect events of interest is important as well
• There are other emerging application domains; let us watch out for them
My final message: Big Data Benchmarking efforts should take this into account.

Acknowledgements
• C. Goebel for help with smart grid slides
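
Note on the quoted data volumes: the three estimates on the smart grid slides (~2.3 TB, ~0.5 TB, and ~27.5 PB per year) can be converted into implied per-device data rates. The following is a back-of-the-envelope sketch only. It assumes decimal units (1 TB = 10^12 bytes), a 365-day year, and that each figure covers 1,000 devices; the function and variable names are illustrative and do not come from the slides.

```python
# Back-of-the-envelope conversion of the yearly volumes quoted on the slides
# into per-device byte rates. All constants below are stated assumptions.
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year

# Yearly volume for 1,000 devices, in bytes (assumption: decimal TB/PB).
VOLUMES_PER_1K_DEVICES = {
    "solar panels": 2.3e12,   # ~2.3 TB
    "PEVs": 0.5e12,           # ~0.5 TB
    "homes": 27.5e15,         # ~27.5 PB
}


def bytes_per_second_per_device(yearly_bytes, devices=1000):
    """Average sustained byte rate for a single device."""
    return yearly_bytes / devices / SECONDS_PER_YEAR


rates = {
    name: bytes_per_second_per_device(volume)
    for name, volume in VOLUMES_PER_1K_DEVICES.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:,.1f} bytes/s per device")
```

Under these assumptions the household figure implies a sustained rate of several hundred kilobytes per second per home, roughly 12,000 times the per-panel rate, which is consistent with the "very high frequency" requirement stated on the power consumption slide.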
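
Note on event showers vs. event streams: the contrast drawn on that slide can be made concrete with a toy sketch. It assumes events are plain dictionaries and subscriptions are sets of per-attribute predicates. `match_shower`, `sliding_average`, and every field name are hypothetical, chosen for illustration, and are not taken from the talk or from any particular system.

```python
from collections import deque


# Event shower: events have no common schema and are matched against
# many subscriptions (event expressions) at once.
def match_shower(event, subscriptions):
    """Return the names of subscriptions whose predicates all hold."""
    matched = []
    for name, predicates in subscriptions.items():
        if all(attr in event and test(event[attr])
               for attr, test in predicates.items()):
            matched.append(name)
    return matched


# Event stream: linearly ordered tuples that share one schema, processed
# by a single aggregate query (here a count-based sliding window).
def sliding_average(stream, field, window):
    """Yield the mean of `field` over the last `window` tuples."""
    recent = deque(maxlen=window)
    for tup in stream:
        recent.append(tup[field])
        yield sum(recent) / len(recent)


# Hypothetical subscriptions and heterogeneous events.
subscriptions = {
    "overvoltage": {"voltage": lambda v: v > 250},
    "low_battery": {"vehicle_id": lambda _: True,
                    "energy_kwh": lambda e: e < 5},
}
shower = [
    {"home_id": 7, "voltage": 253.1},
    {"vehicle_id": 42, "energy_kwh": 3.2},
    {"panel_id": 3, "irradiance": 810},
]
shower_matches = [match_shower(event, subscriptions) for event in shower]

# Hypothetical single-schema stream of power readings.
stream = [{"watts": w} for w in (100, 200, 300, 400)]
averages = list(sliding_average(stream, "watts", window=2))

print(shower_matches)
print(averages)
```

The shower side evaluates each differently shaped event against every subscription, while the stream side runs one query over uniformly shaped tuples; a real system would replace the linear scan over subscriptions with an index.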