Environmental Monitoring Data Challenges Jeremy Cohen Imperial College London

Environmental Monitoring
Data Challenges
Jeremy Cohen
Imperial College London
Data Intensive Research Meeting
Monday 22nd November
Edinburgh, UK
Introduction
•  Mobile detection of environmental pollution
•  Focus on local air pollutants:
•  Nitrogen Oxides (NOx, NO2), Sulphur Dioxide (SO2), Ozone (O3), Carbon Monoxide (CO)
•  VOCs (particularly Benzene (C6H6))
Also consider other local environmental factors, e.g. noise, humidity, temperature.
The MESSAGE Project
•  Mobile Environmental Sensing System Across a Grid Environment
•  3 year project starting October 2006
•  Funded jointly by EPSRC and DfT (~£4m), under EPSRC's e-Science demonstration programme
•  5 Universities, 19 industrial partners
•  Pioneering combination and extension of leading edge computing, sensor, communication and positioning technologies
•  Create radically new sensing infrastructure based on combination of ad-hoc mobile and fixed sensors
•  www.message-project.org
MESSAGE – motivation
Urban air quality is improving, but transport related pollution ‘hot spots’ will remain a key problem
For example, NO2 in London: will we see these improvements by 2010?
[Maps labelled 200x (final digit illegible) and 2010]
Ref: London Atmospheric Emissions Inventory – (2006)
MESSAGE – motivation
By existing standards Central London is relatively well monitored: [number illegible] fixed AQM sites
But this still leaves huge gaps
Research and policy development are hampered by a lack of data of sufficient spatial and temporal granularity
Sensor Devices
•  Data captured and made available by a range of sensor devices
•  Static and mobile
•  Different pollutants captured
•  Sample rates
•  Different communication capabilities
•  Computational power – sensors produce data according to their computational capabilities
Data Lifecycle: MESSAGE e-Science Architecture
Data Lifecycle: Capture
•  Data pre-processing
•  On sensor devices where computational power available
•  At gateway nodes within the network
•  Distributed data mining
•  QA
•  Identification of potentially erroneous values
•  Periodic sensor calibration – drift compensation applied at DB
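The two QA steps above can be sketched in a few lines. The slides do not say which methods MESSAGE used, so this is only an illustration under stated assumptions: suspect values are flagged with a rolling median test, and drift is modelled as an offset that grows linearly between two calibrations. The names `flag_suspect` and `compensate_drift`, the window size and the threshold are all hypothetical.

```python
from statistics import median


def flag_suspect(values, window=5, threshold=3.0):
    """Return indices of readings that sit far from their local median.

    Distance is measured in units of the window's median absolute
    deviation (MAD). Window size and threshold are illustrative defaults,
    not values taken from the MESSAGE project.
    """
    flagged = []
    half = window // 2
    for i, value in enumerate(values):
        neighbours = values[max(0, i - half): i + half + 1]
        centre = median(neighbours)
        mad = median(abs(v - centre) for v in neighbours)
        scale = mad if mad > 0 else 1e-9  # avoid dividing by zero on flat data
        if abs(value - centre) / scale > threshold:
            flagged.append(i)
    return flagged


def compensate_drift(readings, offset_start, offset_end):
    """Subtract a linearly growing offset from evenly spaced readings.

    offset_start and offset_end are the sensor errors measured at the
    calibrations before and after the readings (assumed linear in between).
    """
    n = len(readings)
    if n == 1:
        return [readings[0] - offset_start]
    step = (offset_end - offset_start) / (n - 1) if n > 1 else 0.0
    return [r - (offset_start + step * i) for i, r in enumerate(readings)]


raw = [10, 10, 10, 50, 10, 10, 10]
print(flag_suspect(raw))                            # [3]
print(compensate_drift([10.0, 11.0, 12.0], 0.0, 2.0))  # [10.0, 10.0, 10.0]
```

Flagging rather than deleting keeps the raw value available, which matches the slide's wording of "potentially erroneous" values.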
Data Lifecycle: Storage
•  UTMC-based data store to support range of applications
•  Interpolation
•  Advance preparation of statistical data
•  Outlier detection and interpretation
•  Assembly of app-specific data marts
•  Long-term warehousing of out-of-date data
•  (How) Do we archive everything?
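As one possible reading of the "Interpolation" item, a concentration at an unsampled location can be estimated from nearby sensor readings. The sketch below uses inverse distance weighting, which is an assumption chosen for brevity; the slides do not name the interpolation method used. The function name `idw` and the `(x, y, value)` sample layout are hypothetical.

```python
import math


def idw(point, samples, power=2.0):
    """Estimate a value at `point` from (x, y, value) samples.

    Each sample is weighted by 1 / distance**power, so closer sensors
    dominate. Coordinates are assumed to be in a planar system (metres).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        distance = math.hypot(point[0] - x, point[1] - y)
        if distance == 0:
            return value  # the point coincides with a sensor
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


sensors = [(1.0, 0.0, 10.0), (-1.0, 0.0, 20.0)]
print(idw((0.0, 0.0), sensors))  # 15.0, midway between two equidistant sensors
```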
Database infrastructure
•  Flexible interface for data insertion
•  Single interface transparent access to data across multiple databases
•  Uses OGSA-DAI (www.ogsadai.org.uk), a partner in OMII-UK
•  Variety of data extraction formats
•  Additional output formats may be added
[Diagram labels: Data from sensors; Controller; OGSA-DQP; OGSA-DAI Instance 1, 2, 3; Data Store 1, 2, 3; SQL Query; outputs XML, CSV, KML]
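To illustrate how an additional output format might sit alongside CSV, the sketch below converts CSV rows into KML placemarks using only the Python standard library. It does not use or represent the OGSA-DAI API. The column names (`sensor_id`, `lat`, `lon`, `no2`) and the sample row are assumptions made for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def csv_to_kml(csv_text):
    """Convert CSV sensor rows into a KML document string.

    Expects hypothetical columns: sensor_id, lat, lon, no2.
    """
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for row in csv.DictReader(io.StringIO(csv_text)):
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = row["sensor_id"]
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = (
            f"NO2: {row['no2']}"
        )
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinates are written longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{row['lon']},{row['lat']}"
        )
    return ET.tostring(kml, encoding="unicode")


sample = "sensor_id,lat,lon,no2\nS1,51.4988,-0.1749,42.0\n"
print(csv_to_kml(sample))
```

Keeping each format in its own small converter like this is one way new output formats could be added without touching the query path.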
Data Lifecycle: Analysis, Processing and Visualisation
•  Range of analytical processes that scientists may want to carry out
•  Look at temporal and spatial variations of pollutant concentrations (e.g. hotspot detection)
•  Relationship between different pollutant species
•  Correlation with other factors – e.g. traffic levels, weather, health impacts
•  System management and calibration – sensor control, fault detection, etc.
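Hotspot detection, mentioned above, can be approximated very simply: bin readings into grid cells and report the cells whose mean concentration exceeds a threshold. This is a minimal sketch under assumed inputs, not the project's method; the function name `hotspots`, the planar coordinates, the cell size and the threshold are all hypothetical.

```python
from collections import defaultdict


def hotspots(readings, cell_size, threshold):
    """Return grid cells whose mean reading exceeds `threshold`.

    `readings` is an iterable of (x, y, value) tuples in planar
    coordinates; cells are identified by integer (column, row) indices.
    """
    cells = defaultdict(list)
    for x, y, value in readings:
        cell = (int(x // cell_size), int(y // cell_size))
        cells[cell].append(value)
    return sorted(
        cell for cell, values in cells.items()
        if sum(values) / len(values) > threshold
    )


readings = [
    (10, 10, 30.0), (20, 40, 50.0),    # cell (0, 0), mean 40
    (150, 10, 90.0), (160, 20, 110.0),  # cell (1, 0), mean 100
]
print(hotspots(readings, cell_size=100, threshold=60.0))  # [(1, 0)]
```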
Data Lifecycle: Analysis, Processing and Visualisation
•  Real-time vs. historic/predictive analysis
•  Resource intensive since performance critical
•  How many clients do we need to support?
•  3rd party providers may consume and “resell”
•  Aim for an interface/API that allows the scientist to plug in their “application”
Data Lifecycle: Analysis, Processing and Visualisation
OGSA-DAI Web Query Example
KML Data
CSV Data
Data Lifecycle: Analysis, Processing and Visualisation
•  Clicking a sensor provides statistical information for that sensor
•  Level meter shows average pollution level for each species
•  Selected sensors identified by coloured ring
•  Selected sensors display a trace of recent readings in the data stream history window
Challenges
•  Number of sensors
•  Limited in our trials – up to ~40 sensors live at once
•  Potentially many thousands+
•  Significant variance in number of active sensors
•  Data volumes
•  Potentially very large – e.g. 0.5Mb per sensor/hour at 1Hz
•  We have some control – e.g. dynamic variation of sample rates
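The figures above give a feel for scale. At 1Hz a sensor produces 3,600 samples per hour, so 0.5Mb per sensor per hour comes to 20Mb per hour for the ~40 trial sensors and 500Mb per hour for 1,000 sensors. The sketch below reproduces that arithmetic, assuming volume scales linearly with sensor count and sample rate (the slides do not state this) and keeping the unit exactly as written on the slide. The function names are hypothetical.

```python
PER_SENSOR_HOUR_AT_1HZ = 0.5  # "Mb" per sensor per hour, as quoted on the slide


def hourly_volume(sensors, sample_hz):
    """Estimated volume per hour, assuming linear scaling with both inputs."""
    return sensors * sample_hz * PER_SENSOR_HOUR_AT_1HZ


def max_sample_rate(budget_per_hour, sensors):
    """Highest common sample rate (Hz) that keeps the fleet within a budget."""
    return budget_per_hour / (sensors * PER_SENSOR_HOUR_AT_1HZ)


print(hourly_volume(40, 1.0))        # 20.0  (trial scale)
print(hourly_volume(1000, 1.0))      # 500.0
print(max_sample_rate(500.0, 2000))  # 0.5 Hz keeps 2000 sensors at 500 per hour
```

The second function shows why dynamic sample rates give some control: halving the rate lets twice as many sensors fit in the same hourly volume.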
Challenges
•  What do potential users want?
•  Access to raw / pre-processed data streams?
•  Access to services?
•  Access to user-friendly interface?
•  Measures of success
•  Extensive follow-up work building on elements of this work
•  Application in different domains – e.g. fleet management
THANK YOU
jeremy.cohen@imperial.ac.uk
www.message-project.org
With thanks to MESSAGE project sponsors and colleagues at Imperial College London, University of Cambridge, University of Newcastle, University of Leeds and University of Southampton who worked on the material shown in this presentation.