CMSC828K: Sensor Data Management; Data Streams Amol Deshpande

Today..


Introductions
Overview of the syllabus

What ? Why ?

Grading/Class requirements etc...

No laptops in the class
Why ?

Emergence of sensing devices that can be
networked together on a large scale

New data management challenges


Very high-rate “data streams”

Uncertain/imprecise data

New types of queries
We need to develop techniques to handle
such data
A Sensor

A device that can “sense” things


sense = measure = instrument
Examples:

Traffic Sensors

e.g. traffic cameras

Location Sensors

e.g. cell phones, GPS units
Sensors sensing environmental properties

e.g. temperature, humidity, light etc
Sensor Network

A collection of sensing devices that can communicate with
each other

Can collectively measure or instrument a large scale
phenomenon or property

Increasing number of deployments everywhere

Fueled mainly by developments in MEMS
MICA2 Mote (Berkeley mote)
Types of Sensors

Temperature

Variable resistors that change resistance with temp
Types of Sensors

Photocells, fog detectors

MEMS (Micro-Electro-Mechanical Systems)

Accelerometers, gyroscopes, tilt sensors

Soil moisture (electrical resistance)

GPS

Acoustic (microphones)

Cameras

• Sensors attach via daughtercard
• Weather
– Temperature
– Light x 2 (high intensity PAR, low intensity, full spectrum)
• Vibration
– 2 or 3 axis accelerometers
• Tracking
– Microphone (for ranging)
Types of Sensors

Wireless camera sensors ?



Several projects underway
CMOS based imaging sensors
Very low power, but very low resolution
Cyclops (@UCLA)
CMUCam
WSNs: Vital Signs Monitoring
Project Codeblue: Harvard
Accelerometer, gyroscope, and
electromyogram (EMG) sensor
for stroke patient monitoring.
Pluto mote with a 3-axis accelerometer
Can monitor vital signs (for months on 2 batteries) and transmit them wirelessly.
Can be used to detect anomalous behavior, raise alarms, etc.
RFIDs (“Smart Labels”)

Identify objects from distance

small IC with RF-transponder

Wireless energy supply

~1m, magnetic field (induction)

ROM or EEPROM (writeable)

~100 bytes

Cost ~ $0.1 ... $1

Consumable and disposable

Flexible tags

laminated with paper
RFIDs (“Smart Labels”)

RFID: Increasing number of deployments



Supply chain management
Tracking Luggage, medical equipment, clothes ...
Activity recognition

RFID Tags + iGlove
“Wired” Sensor Networks/Macro-scopes

Traffic/GPS sensors, video cameras, microphones…

Disaster management, Surveillance, Tracking, Activity detection

Somewhat different challenges

Especially for audio/video sensors

But a lot of commonalities...
Thanks to Gutemberg Bezerra
Applications..
Habitat monitoring
Elder care
Seismic structure monitoring
Home automation
Contamination tracking
Measuring pollutants
Location-based services
Traffic monitoring
Supply-chain management (RFID)
Precision Agriculture/Nursery
Object tracking
Smart environments
Surveillance
Industrial Monitoring
Interactive Museums
Battlefield applications
Swimming Pool Monitoring
etc... etc...
Challenges

Hardware platforms

Started with the vision of “smart dust”

Power consumption still an issue


Battery power doesn’t obey Moore’s law
Reliability

Deployments in extreme conditions

Need to autonomously deal with failures
Challenges

Programming interfaces/abstractions


Still too much variety in the platforms...
Networking for wireless sensors

Reliable multi-hop is tricky

Inherently lossy channel

Routing protocols, connectivity


Must deal with mobility, failures
Localization, synchronization
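
To see why reliable multi-hop delivery over a lossy channel is tricky, here is a minimal sketch of how per-hop loss compounds. It assumes independent losses with the same loss rate on every hop and an optional fixed number of per-hop retransmissions; the function name and the numbers are illustrative, not taken from any particular protocol.

```python
def end_to_end_delivery(loss_rate, hops, retries=0):
    """Probability that a packet crosses `hops` links.

    Assumption: each transmission is lost independently with
    probability `loss_rate`, and each hop may retry `retries` times.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be in [0, 1]")
    if hops < 0 or retries < 0:
        raise ValueError("hops and retries must be non-negative")
    # A hop fails only if the first attempt and every retry are lost.
    per_hop_success = 1.0 - loss_rate ** (retries + 1)
    return per_hop_success ** hops


if __name__ == "__main__":
    for hops in (1, 5, 10):
        print(hops, end_to_end_delivery(0.1, hops),
              end_to_end_delivery(0.1, hops, retries=2))
```

Under these assumptions, even a modest per-hop loss rate erodes delivery quickly as paths get longer, while a few per-hop retries recover most of it at the cost of extra transmissions (and power).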
Challenges

Security/Privacy

A very important issue...

Monitoring has been happening, and will continue,
no matter what


WSNs just make it significantly easier
Who controls the data ? Who sees it ?
Challenges

Finally...

the focus of the class...
Data management challenges

Data generated in real-time and continuously
(distributed data streams)

Traditional database systems have significantly
more static data

Tremendous amounts of data generated

Much of it useless, but need to process all

Must be processed immediately
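
As a rough illustration of processing a stream immediately, the sketch below keeps a running mean over only the most recent readings, touching each reading once as it arrives. The class and function names are hypothetical, and a fixed-size sliding window is just one possible choice of summary.

```python
from collections import deque


class SlidingWindowMean:
    """Running mean over the most recent `size` readings of a stream."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.size = size
        self.window = deque()
        self.total = 0.0

    def update(self, value):
        """Consume one reading and return the current windowed mean."""
        self.window.append(value)
        self.total += value
        if len(self.window) > self.size:
            # Evict the oldest reading so memory stays bounded.
            self.total -= self.window.popleft()
        return self.total / len(self.window)


def windowed_means(stream, size):
    """Yield the windowed mean after each reading, in a single pass."""
    agg = SlidingWindowMean(size)
    for reading in stream:
        yield agg.update(reading)


if __name__ == "__main__":
    print(list(windowed_means([1, 2, 3, 4, 5], size=3)))
```

Memory use depends on the window size, not on the length of the stream, which is the property that matters when data arrives continuously.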
Data management challenges

Typically acquisitional environments

Data is not gathered/sensed until asked for

Should carefully decide what to “acquire”

Required to conserve power as well as for sanity

Changes query processing quite fundamentally

“Acquisitional” QP
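
One way to picture acquisitional query processing: a conjunctive query only acquires a sensor attribute when it is still needed, and tries cheap attributes first. The sketch below is a simplified illustration, not the algorithm of any specific system; the `acquire` callback, the attribute names, and the cost numbers are all hypothetical, and ordering purely by cost ignores predicate selectivity.

```python
def evaluate_acquisitional(predicates, acquire):
    """Evaluate an AND of predicates, acquiring readings lazily.

    predicates: list of (attribute, cost, test) tuples, where `cost`
        is an assumed energy cost of sampling that attribute.
    acquire: function mapping an attribute name to a fresh reading
        (stands in for a real sensor read).
    Returns (result, total_cost, acquired_attributes).
    """
    spent = 0.0
    acquired = []
    # Cheapest attribute first; stop as soon as one predicate fails.
    for attr, cost, test in sorted(predicates, key=lambda p: p[1]):
        value = acquire(attr)
        spent += cost
        acquired.append(attr)
        if not test(value):
            return False, spent, acquired
    return True, spent, acquired


if __name__ == "__main__":
    readings = {"light": 50, "temp": 25}  # made-up sensor values
    query = [
        ("temp", 5.0, lambda v: v > 20),
        ("light", 1.0, lambda v: v > 100),
    ]
    print(evaluate_acquisitional(query, readings.__getitem__))
```

Here the expensive temperature sensor is never sampled when the cheap light reading already rules the tuple out, which is where the power savings come from.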
Data management challenges

Raw data is inherently uncertain

Lossy: sensor, communication link failures

Imprecise: errors in the sensing

Uncertain/probabilistic: fundamental limitations in
the way sensing is done


Use of statistical models fundamental
“Probabilistic Databases” ?
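
To make the idea of querying uncertain data concrete, the sketch below treats a single reading as a Gaussian distribution rather than an exact value, and answers a threshold query with a probability. The Gaussian error model and the function name are assumptions made for illustration; real sensors may call for other models.

```python
import math


def prob_exceeds(mean, std, threshold):
    """P(X > threshold) for X ~ Normal(mean, std**2).

    `mean` is the reported reading and `std` the assumed sensing error.
    """
    if std <= 0:
        raise ValueError("std must be positive")
    z = (threshold - mean) / (std * math.sqrt(2.0))
    # Complement of the Gaussian CDF, written with the error function.
    return 0.5 * (1.0 - math.erf(z))


if __name__ == "__main__":
    # Reported 20 degrees with an assumed error of 1 degree:
    # how likely is the true temperature above 21?
    print(prob_exceeds(20.0, 1.0, 21.0))
```

A query such as "temperature > 21" then returns a probability for each reading instead of a yes/no answer, which is the kind of result a probabilistic database would have to store and combine.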
Data management challenges

Need for real-time statistical modeling

Event/anomaly/pattern detection

Removing noise from the data

Spatial/temporal biases in the data

…
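
As a small example of real-time modeling on a stream, the sketch below smooths noisy readings with an exponentially weighted moving average and flags readings that jump too far from the current estimate. The smoothing factor, the fixed threshold, and the choice to leave the estimate unchanged on an anomaly are illustrative assumptions, not a recommended configuration.

```python
def ewma_filter(stream, alpha=0.5, threshold=5.0):
    """Yield (smoothed_estimate, is_anomaly) for each reading.

    alpha: weight given to the newest reading (assumed value).
    threshold: deviation from the estimate that counts as an anomaly
        (assumed value).
    """
    estimate = None
    for x in stream:
        if estimate is None:
            # The first reading initializes the model.
            estimate = x
            yield estimate, False
            continue
        is_anomaly = abs(x - estimate) > threshold
        if not is_anomaly:
            # Only normal readings update the smoothed estimate.
            estimate = alpha * x + (1 - alpha) * estimate
        yield estimate, is_anomaly


if __name__ == "__main__":
    for smoothed, flag in ewma_filter([10.0, 12.0, 30.0, 11.0]):
        print(smoothed, flag)
```

The filter keeps constant state per sensor, so it can run on every reading as it arrives.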
Data management challenges

Data provenance

Being able to trace something back to its origins

Data exploration and visualization

Managing large-scale spatio-temporal datasets

Data interoperability

Data security and privacy

…
Data management challenges

Combination of all these factors has made
this a very challenging and exciting research
area....

Multi-disciplinary solutions required

Databases + Machine Learning + Networking
Class

Class based on reading papers...

Sorta-classic papers exist for two topics

Wireless sensor networks

Data streams

Not so much about probabilistic modeling, uncertain data
etc…

Schedule on the web....

Class Forum
Outline

Overview of sensing technologies, hardware trends,
applications: 2-3 classes

Declarative query processing in sensornets: 2-3 classes

Data streams: system design, query processing and
optimization, adaptive query processing: 7-8 classes

Probabilistic graphical models and their role in sensor data
management: 4-5 classes

Uncertain, probabilistic databases: 5-6 classes

Tentative schedule
Class Structure

Before each class, email reading summaries to me


Include “828 Summary” in the subject.
Each class, presentation followed by discussion about the
papers


Most by me

Some by you
If you plan to attend regularly but are not enrolled, you
should consider doing a presentation as well
Grading

Summaries: 10

Participation + Presentation: 10

Project: 40

Literature survey

Intermediate progress report

Final report + presentation (or maybe poster)

Homework/Exam: 40

Remember, this is a graduate class
That's all...

Questions ?
Wireless Sensing Devices
LWIM III (UCLA, 1996): Geophone, RFM radio, PIC, star network
AWAIRS I (UCLA/RSC, 1998): Geophone, DS/SS radio, strongARM, multi-hop networks
Sensor Mote (UCB, 2000): RFM radio, Atmel
Medusa, MK-2 (UCLA NESL, 2002)
Predecessors in
DARPA Packet Radio program
USC-ISI Distributed Sensor Network Project (DSN)
Examples of Wireless SNs
• Micro-sensors, onboard processing, wireless interfaces feasible at very small scale: can monitor phenomena “up close”
• Enables spatially and temporally dense environmental monitoring
Embedded Networked Sensing will reveal previously unobservable phenomena
Ecosystems, Biocomplexity
Marine Microorganisms
Contaminant Transport
Seismic Structure Response