CMSC828K: Sensor Data Management; Data Streams
Amol Deshpande

Today..
  Introductions
  Overview of the syllabus
    What? Why?
  Grading/Class requirements etc...
  No laptops in the class

Why?
  Emergence of sensing devices that can be networked together on a large scale
  New data management challenges
    Very high-rate “data streams”
    Uncertain/imprecise data
    New types of queries
  We need to develop techniques to handle such data

A Sensor
  A device that can “sense” things
    sense = measure = instrument
  Examples:
    Traffic sensors, e.g. traffic cameras
    Location sensors, e.g. cell phones, GPS units
    Sensors sensing environmental properties, e.g. temperature, humidity, light etc.

Sensor Network
  A collection of sensing devices that can communicate with each other
  Can collectively measure or instrument a large-scale phenomenon or property
  Increasing number of deployments everywhere
    Fueled mainly by developments in MEMS
  MICA2 Mote (Berkeley mote)

Types of Sensors
  Temperature
    Variable resistors that change resistance with temperature
  Photocells, fog detectors
  MEMS (Micro-Electro-Mechanical Systems)
    Accelerometers, gyroscopes, tilt sensors
  Soil moisture (electrical resistance)
  GPS
  Acoustic (microphones)
  Cameras

Types of Sensors
  Sensors attach via daughtercard
  Weather
    Temperature
    Light x 2 (high intensity PAR, low intensity, full spectrum)
  Vibration
    2 or 3 axis accelerometers
  Tracking
    Microphone (for ranging)

Types of Sensors
  Wireless camera sensors?
    Several projects underway
    CMOS-based imaging sensors
    Very low power, but very low resolution
  Cyclops (@UCLA)
  CMUCam

WSNs: Vital Signs Monitoring
  Project Codeblue: Harvard
  Accelerometer, gyroscope, and electromyogram (EMG) sensor for stroke patient monitoring
  Pluto mote with a 3-axis accelerometer
  Can monitor the vital signs (for months on 2 batteries), and transmit them wirelessly
  Can be used to detect anomalous behavior, to raise alarms etc…

RFIDs (“Smart Labels”)
  Identify objects from a distance
    Small IC with RF-transponder
  Wireless energy supply
    ~1m, magnetic field (induction)
  ROM or EEPROM (writeable)
    ~100 bytes
  Cost ~ $0.1 ... $1
    Consumable and disposable
  Flexible tags
    Laminated with paper

RFIDs (“Smart Labels”)
  RFID: Increasing number of deployments
    Supply chain management
    Tracking
      Luggage, medical equipment, clothes ...
    Activity recognition
      RFID Tags + iGlove

“Wired” Sensor Networks/Macro-scopes
  Traffic/GPS sensors, video cameras, microphones…
  Somewhat different challenges
  Disaster management, Surveillance, Tracking, Activity detection
    Especially for audio/video sensors
  But a lot of commonalities...
  Thanks to Gutemberg Bezerra

Applications..
  Habitat monitoring
  Elder care
  Seismic structure monitoring
  Home automation
  Contamination tracking
  Measuring pollutants
  Location-based services
  Traffic monitoring
  Supply-chain management (RFID)
  Precision Agriculture/Nursery
  Object tracking
  Smart environments
  Surveillance
  Industrial Monitoring
  Interactive Museums
  Battlefield applications
  Swimming Pool Monitoring
  etc... etc...

Challenges
  Hardware platforms
    Started with the vision of “smart dust”
    Power consumption still an issue
      Battery power doesn’t obey Moore’s law
  Reliability
    Deployments in extreme conditions
    Need to autonomously deal with failures

Challenges
  Programming interfaces/abstractions
    Still too much variety in the platforms...
  Networking for wireless sensors
    Reliable multi-hop is tricky
      Inherently lossy channel
    Routing protocols, connectivity
      Must deal with mobility, failures
    Localization, synchronization

Challenges
  Security/Privacy
    A very important issue...
    Monitoring has been happening, and will continue, no matter what
    WSNs just make it significantly easier
    Who controls the data? Who sees it?

Challenges
  Finally... the focus of the class...
Data management challenges
  Data generated in real-time and continuously (distributed data streams)
    Traditional database systems have significantly more static data
  Tremendous amounts of data generated
    Much of it useless, but need to process all
    Must be processed immediately

Data management challenges
  Typically acquisitional environments
    Data is not gathered/sensed until asked for
    Should carefully decide what to “acquire”
    Required to conserve power as well as for sanity
  Changes query processing quite fundamentally
    “Acquisitional” QP

Data management challenges
  Raw data is inherently uncertain
    Lossy: sensor, communication link failures
    Imprecise: errors in the sensing
    Uncertain/probabilistic: fundamental limitations in the way sensing is done
  Use of statistical models fundamental
  “Probabilistic Databases”?

Data management challenges
  Need for real-time statistical modeling
    Event/anomaly/pattern detection
    Removing noise from the data
    Spatial/temporal biases in the data
    …

Data management challenges
  Data provenance
    Being able to trace something back to its origins
  Data exploration and visualization
  Managing large-scale spatio-temporal datasets
  Data interoperability
  Data security and privacy
  …

Data management challenges
  Combination of all these factors has made this a very challenging and exciting research area....
  Multi-disciplinary solutions required
    Databases + Machine Learning + Networking

Class
  Class based on reading papers...
  Sorta-classic papers exist for two topics
    Wireless sensor networks
    Data streams
  Not so much about probabilistic modeling, uncertain data etc…
  Schedule on the web....
  Class Forum

Outline
  Overview of sensing technologies, hardware trends, applications: 2-3 classes
  Declarative query processing in sensornets: 2-3 classes
  Data streams: system design, query processing and optimization, adaptive query processing: 7-8 classes
  Probabilistic graphical models and their role in sensor data management: 4-5 classes
  Uncertain, probabilistic databases: 5-6 classes
  Tentative schedule

Class Structure
  Before each class, email reading summaries to me
    Include “828 Summary” in the subject.
  Each class, presentation followed by discussion about the papers
    Most by me
    Some by you
  If you plan to attend regularly but are not enrolled, you should consider doing a presentation as well

Grading
  Summaries: 10
  Participation + Presentation: 10
  Project: 40
    Literature survey
    Intermediate progress report
    Final report + presentation (or maybe poster)
  Homework/Exam: 40
  Remember, this is a graduate class

That's all...
  Questions?

Wireless Sensing Devices
  LWIM III (UCLA, 1996): Geophone, RFM radio, PIC, star network
  AWAIRS I (UCLA/RSC, 1998): Geophone, DS/SS radio, strongARM, multi-hop networks
  Sensor Mote (UCB, 2000): RFM radio, Atmel
  Medusa, MK-2 (UCLA NESL, 2002)
  Predecessors in
    DARPA Packet Radio program
    USC-ISI Distributed Sensor Network Project (DSN)

Examples of Wireless SNs
  Ecosystems, Biocomplexity
  Marine Microorganisms
  Contaminant Transport
  Seismic Structure Response
  Micro-sensors, onboard processing, wireless interfaces feasible at very small scale; can monitor phenomena “up close”
  Enables spatially and temporally dense environmental monitoring
  Embedded Networked Sensing will reveal previously unobservable phenomena