SafeQL

HiPC 2003 Tutorial
System Support for Sensor Networks
Speakers:
Sharad Mehrotra, Univ. of California, Irvine
Nalini Venkatasubramanian, Univ. of California, Irvine
Rajesh Gupta, Univ. of California, San Diego
Quasar Group
Acknowledgements (for Slides)
• Nesime Tatbul, Kevin Hoeschele, Anurag Shakti Maskey
(AURORA team)
• Jennifer Widom, Rajeev Motwani (STREAM)
• Sam Madden (TinyDB)
• Anantha Chandrakasan (MIT uAMPS TEAM)
• Qi Han, Iosif Lazaridis, Xingbo Yu (QUASAR team)
• Srini Seshan (Irisnet)
• Slides for tutorial available at
– http://www.ics.uci.edu/~quasar/tutorial/hipc.ppt
Various Sensor Applications
• Habitat Monitoring
• Battlefield Monitoring
• Earthquake Monitoring
• Medical Condition Monitoring
• Video Surveillance
• Oceanographic Current Monitoring
• Intrusion Detection
• Target Tracking & Detection
• Traffic Congestion Detection
Taxonomy of Applications (1)
• Data Access needs of applications
– Historical data
• Analysis to better understand the physical world
– Current data
• Monitoring and control to optimize the processes that
drive the physical world
– Future data
• Forecasting trends in data for decision making
Taxonomy of Applications (2)
• Predictability of Data access
– Fixed
• data access needs of applications known a-priori
– Unpredictable (ad-hoc)
• Data access needs of applications not known at any
instance of time
– Predictable (continuous)
• Data access needs of applications can be predicted for
some time in the future with high probability
Application Landscape
Rows: temporal property of data accessed (the future, the present, the past).
Columns: predictability of data access (no knowledge, some knowledge, full knowledge).
• The future
– no knowledge: Predict noise levels around the airport if runway 2 becomes operational
– some knowledge: I’m going surfing on Sep. 30! Will it be windy?
– full knowledge: Each evening at 8pm predict the temperature for the next 5 days
• The present
– no knowledge: Visualize current humidity with Mrs. Doe’s new interpolation scheme.
– some knowledge: How much snow is there in Aspen?
– full knowledge: Notify me immediately when there is a forest fire
• The past
– no knowledge: Is Mr. Doe’s newly proposed weather model accurate for 1996-2000?
– some knowledge: Did the temperature rise above 40°C in the last year?
– full knowledge: Every month, calculate the average humidity in California for the last 30 days
Basic architecture of sensor nodes
Sensor Properties – Different Capabilities
• Storage
– Built-in memory
• Sensing
• Computing
– Micro-processor or micro-controller
• Communication
– Short range radio for wireless communication
Sensor Properties – Resource Constraints
• Lower transmission distances (< 10m)
• Lower bit rates (typically < kbps)
• Limited battery capacity
Radio mode    Power consumption (mW)
Transmit      14.88
Receive       12.50
Idle          12.36
Sleep         0.016
Sensor Devices today
• A series of sensor nodes developed
• MIT uAMPS
– 59 MHz to 206 MHz processor
– 2 radios, capable of transmitting at 1 Mbps
– 4 KB RAM
• Berkeley Mica motes
– 8-bit, 4 MHz processor
– 40 kbit CSMA radio
– 4 KB RAM
– TinyOS based
Sensor OS Concepts
• Constrained Scheduling
– Event-based(?)
• Constrained Storage Model
– frame per component, shared stack, no heap
• Very lean multithreading
• Efficient Layering
[Figure: a Messaging Component with an internal thread and internal state. Upper interface: commands init, power(mode), send_msg(addr, type, data); events msg_rec(type, data), msg_send_done. Lower interface: commands init, Power(mode), TX_packet(buf); events TX_packet_done(success), RX_packet_done(buffer).]
Sensor Network Properties
• restricted resources
• frequent topology changes and network partitions
• prone to failure
• small-scale sensor nodes
• fixed vs. mobile sensor grids
• infrastructure based vs. ad-hoc communication
• environmental influence
• unattended operation
• node mobility
• depleted battery
• concurrency issues
• dense deployment in large numbers
• heterogeneity issues
• scalability issues
Controversies with sensor networks
• How is this different from mobile ubiquitous
computing?
• Network-centric vs. edge-centric
architecture?
– Passive sensors vs. smart sensors
• A new class of algorithms?
– Traditional deterministic vs. probabilistic vs.
epidemic
Wireless Networked Embedded Systems
Characteristics
• Wireless
– limited bandwidth, high latency (3ms-100ms)
– variable link quality and link asymmetry due to noise,
interference, disconnections
– easier snooping
need for more signal and protocol processing
• Mobility
– causes variability in system design parameters: connectivity,
b/w, security domains, location awareness
need for more protocol processing
• Portability
– limited capacities (battery, CPU, I/O, storage, dimensions)
need for energy efficient signal and protocol processing
Capacity of Wireless Sensor
Networks
• Sensor Networks
– nodes can sense (actuate), compute, communicate
• at the next level, these nodes and networks can infer, track,
correlate and correspond
– when such nodes can be composed, the application
possibilities can be wildly imaginative
• highly intelligent real-time distributed systems
• However, there are fundamental limits to scaling that
have to do with the ad hoc nature of such networks
– nodes building links and communicating (including relaying,
setup and discovery) without a central control
Communication in Sensor Networks
• Questions we seek to answer
– How much information can wireless sensor networks transport?
• What can be done to maximize this transport?
– What is the right power level for transport?
• Where is this control (best) exercised?
– What is the appropriate network configuration
• Direct communication (single-hop)
• Multi-hop communication
– Directed diffusion, LAR, GF
• Cluster-based communication
– LEACH
Challenges for Sensor Networks
• Services for localization, discovery, storage, agreement
• Integration of communication and application-specific data processing
• Automatic configuration & error handling
• Injection of application knowledge into sensor network infrastructure
• Quality of data/service guarantees under resource constraints
• Time & location management
Projects on Sensor Networks
Sensor OS
Network
related
ISI
UCLA
USC
SensIT
NEST
NEST
Stabilization
Ohio-state
Univ. of Iowa
Michigan state
Univ.
UT-Arlington
Kent State Univ.
UC-Berkeley
MIT muOS
QoS in
Surveillance
and Control
MIT
Duke Univ.
Univ. of Hawaii
Univ. of Wisconsin
Northwestern Univ.
Penn State Univ.
Auburn Univ.
UIUC
Univ. of Virginia
CMU
WebDust
Rutgers
Quasar
UC-Irvine
Xerox
SmartDust
UC-Berkeley
TinyDB
UC-Berkeley
Aurora
Brown, MIT,
Brandeis Univ.
Cougar
Cornell
What are the Choices?
Sensor networks
Wireless networks
Specialized
infrastructure
COTS
infrastructure
Smart sensors
Passive sensors
Probabilistic
guarantees
Deterministic
solutions
This tutorial – systems perspective
• Layered approach
– Device level
• Challenges in design of sensor devices and OSs
– Distributed sensor networks
• Challenges in managing large networks of sensors to
meet application requirements
– Sensor Database Management
• Challenges in Query Processing over sensor networks
Design of sensor nodes
• Sensor Node Components
– Computation/communication tradeoff
• Energy Management within a sensor
– Computation/communication tradeoff
• Power-aware OS design for sensors
Distributed Computing Infrastructure for Sensors
• Designing Distributed Sensor Architectures
– Server oriented -- data migrates to server from sensors
• Store or not store (stream)?
• When should data migrate?
• How should data migrate: in its original raw form or in some aggregated form?
– Distributed approach
• Data does not migrate, requests/Queries migrate
• TinyDB approach, Dimensions approach
• Designing Middleware Support for Sensor Networks
– Energy-Efficiency
– Real-time
– Fault tolerance
Query Processing in Sensor Networks
• Query Processing over Sensor Databases
– Taxonomy of queries
• Lifetime queries, aggregation queries, approximate queries, set
based queries
– Where do queries arise
• At the server, fully distributed at any node
– Query semantics
• What does a query mean? Exact semantics not very clear.
– Query Processing techniques
• Answering Approximate Queries over Approximate
Representation
• Answering Queries in the network
• Distributed Query Answering
• Data Stream processing & Dynamic Data
Design Issues in Sensor Devices
HiPC 2003, Hyderabad, India
Energy Availability Growth limited to 2-3% per year
[Figure: improvement (compared to year 0), 1x to 16x, versus time (years), 0 to 6. Source: J. Rabaey, BWRC]
Need to be energy efficient at all
Computational Efficiency
• Speed power efficiency has indeed gone up
– 10x / 2.5 years for μPs and DSPs in 1990s
• from 100 mW/MIP to 1 mW/MIP since 1990
– IC processes have provided 10x / 8 years since 1965
– rest from power conscious IC design in recent years
• Lower power for a given function & performance
Processor              MHz   Year   SPECint-95   Watts
P54VRT (Mobile)        150   1996   4.6          3.8
P55VRT (Mobile MMX)    233   1997   7.1          3.9
PowerPC 603e           300   1997   7.4          3.5
PowerPC 604e           350   1997   14.6         8
PowerPC 740 (G3)       300   1998   12.2         3.4
PowerPC 750 (G3)       300   1998   14           3.4
Mobile Celeron         333   1999   13.1         8.6
• However, circuit gains are nearing a plateau
– circuit tricks & voltage scaling provided a large part of the gains
• while energy needs
Efficiency in Communications
• Power Efficiency (or Energy Efficiency) P = E_b/N_0
– ratio of signal energy per bit to noise power spectral density required at the receiver for a certain BER
– high power efficiency requires a low E_b/N_0 for a given BER
• Bandwidth Efficiency B = bit rate / bandwidth = R_b/W bps/Hz
– ratio of throughput data rate to bandwidth occupied by the modulated signal (typically ranges from 0.33 to 5)
• Often a trade-off between the two
Communication vs. Computation
• Computation cost (2004 projected): 60 pJ/op
• Minimum thermal energy for communications:
– 20 nJ/bit @ 1.5 GHz for 100 m
• equivalent of 300 ops
– 2 pJ/bit @ 1.5 GHz for 10 m
• equivalent of 0.03 ops
• Significant processing versus communication tradeoff
J. Rabaey, BWRC
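The "equivalent ops" figures follow from dividing the per-bit radio energy by the per-operation compute energy. The sketch below redoes that arithmetic. It takes the 10 m figure as 2 pJ/bit, which is what the stated 0.03 ops equivalence implies at 60 pJ/op; the function and variable names are illustrative only.

```python
# Energy figures quoted on the slide (attributed to J. Rabaey, BWRC).
PJ = 1e-12
NJ = 1e-9

compute_energy_per_op = 60 * PJ      # projected cost of one operation

# Minimum thermal energy per transmitted bit at 1.5 GHz, keyed by link distance in metres.
radio_energy_per_bit = {
    100: 20 * NJ,
    10: 2 * PJ,   # assumed to be pJ, consistent with "0.03 ops"
}


def ops_per_bit(distance_m: int) -> float:
    """Number of operations that cost as much energy as sending one bit."""
    return radio_energy_per_bit[distance_m] / compute_energy_per_op


for distance in (100, 10):
    print(f"{distance:>3} m: one bit costs about {ops_per_bit(distance):.2f} ops")
```

At 100 m a bit costs roughly as much as a few hundred operations, so local processing that removes bits pays off; at 10 m the balance reverses.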
The Need
• Power consumption, energy efficiency is a
system level design concern
– efficiency in computation, communication and
networking subsystems
• The energy/power tradeoffs cut across
– all system layers: circuit, architecture, software,
algorithms
– need to choose the right metric
• Power awareness goes beyond low power
concerns
Where does the Power Go?
[Figure: block diagram of a wireless node's power consumers]
• Peripherals: disk, display
• Processing: programmable μPs & DSPs (apps, protocols etc.), ASICs, memory
• Power supply: battery, DC-DC converter
• Communication: radio modem, RF transceiver, baseband DSP
– signaling protocols, choice of modulation, TX/RX architecture, RF/IF circuits
Example 1: Power Measurements on Rockwell WINS Node
Processor   Seismic Sensor   Radio            Power (mW)
Active      On               Rx               751.6
Active      On               Idle             727.5
Active      On               Sleep            416.3
Active      On               Removed          383.3
Active      Removed          Removed          360.0
Active      On               Tx (36.3 mW)     1080.5
Active      On               Tx (27.5 mW)     1033.3
Active      On               Tx (19.1 mW)     986.0
Active      On               Tx (13.8 mW)     942.6
Active      On               Tx (10.0 mW)     910.9
Active      On               Tx (3.47 mW)     815.5
Active      On               Tx (2.51 mW)     807.5
Active      On               Tx (1.78 mW)     799.5
Active      On               Tx (1.32 mW)     791.5
Active      On               Tx (0.955 mW)    787.5
Active      On               Tx (0.437 mW)    775.5
Active      On               Tx (0.302 mW)    773.9
Active      On               Tx (0.229 mW)    772.7
Active      On               Tx (0.158 mW)    771.5
Active      On               Tx (0.117 mW)    771.1
Summary
• Processor = 360 mW
– doing repeated transmit/receive
• Sensor = 23 mW
• Processor : Tx
Capabilities: vibration, acoustic, accelerometer, magnetometer, temperature sensing
[Figure: node block diagram with GPS; communication subsystem (radio, modem, micro controller); rest of the node (CPU, sensor)]
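The summary figures can be recovered by differencing rows of the measurements above. The sketch below does that for the sensor and for the non-transmitting radio modes. The dictionary layout and helper name are illustrative, and attributing each difference to a single component is an assumption that ignores any interaction between subsystems.

```python
# Measured node power in mW, copied from the measurements above.
# Keys are (seismic sensor state, radio state); the processor is active in every row.
measured_mw = {
    ("on", "rx"): 751.6,
    ("on", "idle"): 727.5,
    ("on", "sleep"): 416.3,
    ("on", "removed"): 383.3,
    ("removed", "removed"): 360.0,
}

# With sensor and radio both removed, what remains is the processor.
processor_mw = measured_mw[("removed", "removed")]

# Adding the sensor back raises the total by the sensor's own draw.
sensor_mw = measured_mw[("on", "removed")] - processor_mw


def radio_mw(mode: str) -> float:
    """Power attributable to the radio in the given mode (rx, idle, or sleep)."""
    return measured_mw[("on", mode)] - measured_mw[("on", "removed")]


print(f"processor: {processor_mw:.1f} mW, sensor: {sensor_mw:.1f} mW")
for mode in ("rx", "idle", "sleep"):
    print(f"radio {mode}: {radio_mw(mode):.1f} mW")
```

On these numbers the idle radio draws nearly as much as the receiving radio, which is why the notes that follow stress minimizing listening.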
Power Consumption Notables
• Differences in radio “sleep” versus “shutdown” can be
significant
– need power management strategies at module/subsystem
level
• Generally RX power is less than TX power.
• However, as TX drops to lower power levels, under some circumstances it may be less than RX power
– particularly true in “sensor” type nodes
– need protocols that minimize listening needed
– need very low power “paging” channels for wakeup
• Processing can be a significant fraction of total power
– 30-50%
Metrics for Power
• Absolute power (mW)
– sets battery life in hours
– problem: power ∝ frequency (slow the system!)
• uW/MHz
– average energy consumed by the system
• Energy per operation
– fixes obvious problem with the power metric
– but can cheat by doing stuff that will slow the chip
– Energy/op = Power * Delay/op
• Metric should capture both energy and performance: e.g. Energy/Op * Delay/Op
• Energy*Delay = Power*(Delay/Op)^2
• Therefore:
– uW/MIPS: average energy per instruction
– uW/MIPS^2: normalizes uW/MIPS with the architectural performance
• useful for comparing architectures for power efficiency.
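To see why energy per operation alone can mislead, the sketch below evaluates both metrics for two operating points. The class, its field names, and the power and throughput numbers are all hypothetical, chosen only to show the two metrics disagreeing.

```python
from dataclasses import dataclass


@dataclass
class OperatingPoint:
    name: str
    power_mw: float   # average power while running, in mW
    mips: float       # throughput, in millions of instructions per second

    def energy_per_op_nj(self) -> float:
        # Energy/op = Power * Delay/op; mW divided by MIPS gives nJ per instruction.
        return self.power_mw / self.mips

    def energy_delay(self) -> float:
        # Energy*Delay = Power * (Delay/op)^2, in units of mW/MIPS^2.
        return self.power_mw / self.mips ** 2


fast = OperatingPoint("fast", power_mw=400.0, mips=200.0)   # hypothetical numbers
slow = OperatingPoint("slow", power_mw=50.0, mips=50.0)     # hypothetical numbers

for point in (fast, slow):
    print(point.name, point.energy_per_op_nj(), point.energy_delay())
```

The slow point wins on energy per operation but loses on energy*delay, which is the "cheating by slowing the chip" effect the combined metric is meant to expose.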
Node Level Power Management
• Choices: H/W, Firmware, OS, Application, Users
• Hardware & firmware
– don’t know the global state and application-specific knowledge
• Users
– don’t know component characteristics, and can’t make frequent
decisions
• Applications
– operate independently
– and the OS hides machine information from them
• OS is a reasonable place, but…
– OS should incorporate application information in power
management
– OS should expose power state and events to applications for them
to adapt.
Operating System Directed Power
Management
• Significant opportunities in power management lie with
application-specific “knobs”
– quality of service, timing criticality of various functions
• OS plays an important role in allocation, sharing of critical
resource
– it is a logical place for dynamic power management
– application-specific constraints and opportunities for saving energy
that can be known only at that level
• Needs of applications are driving force for OS power
management functions & power-based API
– collaboration between applications and the OS in setting “energy
use policy”
• OS helps resolve conflicts and promote cooperation
Slowdown by reducing supply
voltage – Dynamic Voltage Scaling
• Reduction in supply voltage reduces speed
• Reduce supply voltage when
– slower speed can be tolerated
– or use architectural techniques to combat slow
operation
• e.g. concurrency, pipelining via compiler techniques
Shutdown for Energy Saving
– Shutdown attractive for many wireless applications due to low duty cycle of many subsystems
[Figure: timeline alternating between Active ("On") for Tactive and Blocked ("Off") for Tblock, annotated with the ideal improvement]
– Issues:
• Cost of restarting: latency vs. power trade-off
– increase in latency (response time)
– increase in power consumption due to startup
• When to Shutdown:
– Optimal vs. Idle Time Threshold vs. Predictive
• When to Wakeup:
– Optimal vs. On-demand vs. Predictive
• Two main approaches: (Reactive versus Predictive)
– “Go to Reduced Power Mode after the user has been idle for a few seconds/minutes, and restart on demand”
– “Use computation history to predict whether Tblock[i] is large enough (Tblock[i] ≥ Tcost)”
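The two approaches can be contrasted in a few lines. This is a minimal sketch: the exponential-average predictor, the parameter `alpha`, and both function names are assumptions made for illustration, not the predictor used by any particular system.

```python
def reactive_off_time(idle_periods, threshold):
    """Reactive policy: stay on for `threshold` time units of idleness, then shut down.

    Returns the time spent off in each idle period."""
    return [max(0.0, t - threshold) for t in idle_periods]


def predictive_decisions(idle_periods, t_cost, alpha=0.5):
    """Predictive policy: at the start of idle period i, shut down immediately
    if the predicted Tblock[i] is at least Tcost.

    The prediction is an exponential average of earlier idle periods
    (an assumed predictor). Returns one boolean decision per idle period."""
    decisions = []
    predicted = 0.0
    for actual in idle_periods:
        decisions.append(predicted >= t_cost)
        # Fold the observed idle time into the history after the decision is made.
        predicted = alpha * actual + (1 - alpha) * predicted
    return decisions


idle = [10.0, 10.0, 10.0, 1.0, 10.0]
print(reactive_off_time(idle, threshold=2.0))
print(predictive_decisions(idle, t_cost=4.0))
```

The reactive policy always wastes the threshold time before shutting down, while the predictive policy can shut down at once but mispredicts on the short fourth period.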
To Shutdown or Reduce Voltage?
• Observation:
– better to lower voltage than to shutdown in case of digital logic
• Example: task with 100ms deadline, requires 50ms CPU time at full speed
– normal system gives 50ms computation, 50ms idle/stopped time
– half speed/voltage system gives 100ms computation, 0ms idle
– same number of CPU cycles but energy reduced to 1/4
• Voltage gets dictated by the tightest (critical) timing constraint both on
throughput and latency --> dynamically change voltage
– Use voltage to control the operating point on the power vs. speed curve
• I.e., power and clock frequency are functions of voltage
– Main challenge here is algorithmic:
• one has to schedule the voltage variation as well!
– via compiler or OS or hardware
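The 100 ms example can be checked numerically. The sketch assumes dynamic switching energy of C·V² per cycle and a clock frequency that scales linearly with voltage, as the "half speed/voltage" wording suggests; the full-speed clock rate and the function names are hypothetical.

```python
FULL_SPEED_HZ = 100_000_000      # hypothetical full-speed clock
CYCLES = 5_000_000               # 50 ms of work at full speed
DEADLINE_S = 0.100


def switching_energy(cycles, voltage, c_eff=1.0):
    """Dynamic switching energy: each cycle costs c_eff * V^2 (arbitrary units)."""
    return cycles * c_eff * voltage ** 2


def run(scale):
    """Scale voltage and clock frequency together by `scale` (1.0 = full speed).

    Returns (execution time in seconds, energy in arbitrary units)."""
    time_s = CYCLES / (FULL_SPEED_HZ * scale)
    return time_s, switching_energy(CYCLES, voltage=scale)


t_full, e_full = run(1.0)
t_half, e_half = run(0.5)
print(f"full speed: {t_full * 1e3:.0f} ms busy, energy {e_full:.0f}")
print(f"half speed: {t_half * 1e3:.0f} ms busy, energy {e_half:.0f}")
```

Both runs execute the same cycles and meet the deadline, but the half-voltage run uses a quarter of the energy, whereas shutting down during the idle 50 ms would save none of the switching energy.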
Current OSPM - ACPI
• Advanced Configuration and Power Interface (ACPI)
– OS visible (SCI-based) as opposed to OS invisible (SMI-based)
– OS/drivers/BIOS are in sync regarding power states
• Standard way for the system to describe its device config. & power
control h/w interface to the OS
– register interface for common functions
• system control events, processor power and clock control, thermal management,
and resume handling
• Info on devices, resources, & control mechanisms
– Description Tables, linked in a "table of tables"
– description data for each device:
• Power management capabilities and requirements
• Methods for setting and getting the power state
• Hardware resource settings
• Methods for setting hardware resources
New power-aware interfaces required
• Provide ways by which Application, Operating System
and Hardware can exchange energy/power and
performance related information efficiently.
• Facilitate the continuous dialogue / adaptation
between OS / Applications.
• Facilitate the implementation of power aware OS
services by providing a software interface to low
power devices
– A power-aware API to the end user that enables one to
implement energy-efficient RTOS services and applications
Power-aware API
The applications interface provides the following services:
• The application is able to
– tell RT information to OS (period, deadlines, WCET,
hardness)
– create new threads
– tell OS time predicted to finish a given task instance
• depending on the conditions of the environment (application
dependent and not yet implemented)
• OS must be able to predict and tell applications the
time estimated to finish the task
– depends on the scheduling scheme used
• A hard task must be killed if its deadline is missed.
Power Management in Communication Subsystems
[Figure: OS/Middleware/Application layer spanning the computation and communication subsystems]
• Computation Subsystem: e.g. Dynamic Voltage/Freq. Scaling, Power-aware Task Scheduling
• Communication Subsystem: e.g. Modulation/coding, Power-aware Packet Scheduling
TinyOS Concepts
• Scheduler + Graph of Components
– constrained two-level scheduling model: threads + events
• Component:
– Commands
– Event Handlers
– Frame (storage)
– Tasks (concurrency)
• Constrained Storage Model
– frame per component, shared stack, no heap
• Very lean multithreading
• Efficient Layering
[Figure: a Messaging Component with an internal thread and internal state. Upper interface: commands init, power(mode), send_msg(addr, type, data); events msg_rec(type, data), msg_send_done. Lower interface: commands init, Power(mode), TX_packet(buf); events TX_packet_done(success), RX_packet_done(buffer).]
Application = Graph of Components
[Figure: component graph, from application level down to hardware. Application: route map, router, sensor appln. Packet: Active Messages, Radio Packet, Serial Packet. Byte: Radio byte, UART, Temp, photo. Bit: RFM, ADC, clocks. A SW/HW boundary separates the lowest components from the hardware.]
• Example: ad hoc, multi-hop routing of photo sensor readings
– 3450 B code
– 226 B data
• Graph of cooperating state machines on shared stack
Part 2: Distributed Computing
Infrastructure for Sensor Applications
**Supported in part by a collaborative NSF ITR grant entitled “real-time data capture, analysis, and
querying of dynamic spatio-temporal events” in collaboration with UCLA, U. Maryland, U. Chicago
Managing Distributed Sensor
Infrastructures
• A data collection and management middleware infrastructure
that
– provides seamless access to data dispersed across a hierarchy of
sensors, servers, and archives
– supports multiple concurrent applications of diverse types
– adapts to changing application needs
• Fundamental Issues:
– Where to store data?
• do not store, at the producers, at the servers
– Where to compute?
• At the client, server, data producers
Outline of this section
• Sensor network architectures
• Sensor application needs
– Accuracy, timeliness, cost, reliability
• Tasks of a middleware framework
– Services that can be customized to address needs
• Case studies
– accuracy/cost tradeoffs in collection
– Accuracy/cost/timeliness tradeoffs in collection
– Storage/accuracy tradeoffs in archival
Architectural Configurations
• Server-centric
• Streams
• Hierarchical
• Distributed
Sensor Network Architectures – 1: (server centric)
[Figure: data producers feed a server; a client sends a data/query request to the server and receives a data/query result]
• Traditional data management
– client-server architecture
– efficient approaches to data storage & querying
– query shipping versus data shipping
– data changes with explicit update
• Limitations
– Sensors generate continuously changing data
• Producers must be considered as “first class” entities
– Does not exploit the storage, processing, and communication capabilities of sensors
Sensor Network Architectures – 2:
streams
[Figure: data streams enter a stream processing engine that keeps a synopsis in memory; continuous queries are registered with the engine, which returns an (approximate) answer]
• Stream model
– Data streams through the server but is not stored
– Continuous queries evaluated against streaming data
– Deals with problems due to dynamic data on the server side
• Limitations
– Does not conserve sensor resources (e.g., power)
– Does not exploit the storage and processing capabilities of sensors
– Geared towards continuous monitoring and not archival
applications
Sensor Network Architectures – 3: hierarchical
• Hierarchical architecture (e.g. Quasar)
– data flows from producers to server to clients periodically
– queries flow the other way:
• If client cache does not suffice, then query routed to appropriate server
• If server cache does not suffice, then access current data at producer
– This is a logical architecture
• producers could also be clients
• A server may be a base station or a (more) powerful sensor node
• Servers might themselves be hierarchically organized
• The hierarchy might evolve over time
[Figure: client with client cache; server with server cache and archive; producer and its cache. Query flow runs from client towards producer; data flow runs from producer towards client.]
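The query flow above amounts to trying each cache in turn and falling back to the producer. The sketch below assumes a cache "suffices" when its cached interval is no wider than the query's tolerance; that criterion, and every class and function name, are illustrative assumptions rather than the Quasar implementation.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    low: float
    high: float

    def width(self) -> float:
        return self.high - self.low


class Producer:
    """A sensor that can always report its exact current value."""

    def __init__(self, value: float):
        self.value = value

    def probe(self) -> Interval:
        return Interval(self.value, self.value)


def answer(tolerance, client_cache, server_cache, producer):
    """Route a query down the hierarchy; return (interval, where it was answered)."""
    for source, cached in (("client cache", client_cache),
                           ("server cache", server_cache)):
        # Assumed sufficiency test: the cached interval is tight enough for the query.
        if cached is not None and cached.width() <= tolerance:
            return cached, source
    return producer.probe(), "producer"


client_cache = Interval(0.0, 12.0)
server_cache = Interval(6.0, 12.0)
sensor = Producer(9.0)
for tol in (15.0, 8.0, 1.0):
    print(tol, answer(tol, client_cache, server_cache, sensor))
```

Looser tolerances are answered higher in the hierarchy, so only the most demanding queries cost the producer any energy.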
Sensor Network Architectures – 4: Fully Distributed P2P
• Distributed architecture (e.g. Dimensions)
– Store data at sensor nodes
– Construct distributed load-balanced quad-tree hierarchy of lossy wavelet-compressed summaries corresponding to different resolutions and spatio-temporal scales.
– Queries drill down from root of hierarchy to focus search on small portions of the network.
– Progressively age summaries for long-term storage and graceful degradation of query quality over time.
[Figure: hierarchy with Level 0, Level 1, Level 2, …; summaries are progressively lossy across levels and progressively aged over time]
Outline of this section
• Sensor network architectures
• Sensor application needs
– Accuracy, timeliness, cost, reliability
• Tasks of a middleware framework
– Services that can be customized to address needs
• Case studies
– accuracy/cost tradeoffs in collection
– Accuracy/cost/timeliness tradeoffs in collection
– Storage/accuracy tradeoffs in archival
Balancing Tradeoffs in Application
Requirements
• Accuracy
– More accurate context results in better application
performance
– Very high accuracy may not be needed
• Cost
– Minimize resources consumed
• Network (messaging)
• Energy
• Storage
• Timeliness
– Late data may be useless
• Reliability
– Wrong/missing data may cause problems
Data Representation
• Instantaneous value
• Range-based
– Static Interval
– Dynamic range-based
• Probabilistic distribution
– (mean, stdev) with decay
• Compressed formats
– wavelet
– histograms
– sketches
What is accuracy?
• Resolution
– Temporal (Aurora)
• 1 value for a sliding window of size 5
• Load-shedding, subsetting
– Spatial
• 1 value for a given region of dimension [x, y]
• Value laxity (Quasar)
– Value represented as an interval
• 9 represented as [6,12]
– Value represented as a probability distribution
Tasks of a Sensor Management
Framework
• Translation: mapping application quality requirement to data
quality requirements
– Examples:
• Target tracking: quality of track --> accuracy of data
• Aggregation Queries: accuracy of results --> accuracy of data
– Strategy should adapt to expected application load
• Collection
– Minimize sensor resource consumption while guaranteeing required
data quality
• Storage
• Dissemination/Delivery
Middleware Components
• Applications: mobile target tracking, activity monitoring, location based service, ....
• Adaptive Middleware
– Server Side Components: sensor selection, adaptive precision setting, fault tolerance, prediction module, AQ-DQ translation, sensor database, sensor data management
– Sensor Side Components: sensor state management, prediction module, precision driven adaptation
• Distributed Sensor Environment
Adaptive Tracking of mobile objects
[Figure: an object moves through a wireless sensor grid; base stations 1, 2 and 3 connect the grid over wireless links to a server that provides track visualization. Example request: "Show me the approximate track of the object with a given precision".]
• Tracking Architecture
– A network of wireless acoustic sensors arranged as a grid transmitting via a base station to server
• Objective
– Track a mobile object at the server such that the track deviates from the real trajectory within a user defined error threshold ε_track, with minimum communication overhead.
Basic Triangulation Algorithm
P: source object power, I_i = intensity reading at the ith sensor, located at (x_i, y_i); (x, y) is the object position.
$(x - x_1)^2 + (y - y_1)^2 = P / (4\pi I_1)$
$(x - x_2)^2 + (y - y_2)^2 = P / (4\pi I_2)$
$(x - x_3)^2 + (y - y_3)^2 = P / (4\pi I_3)$
Solving we get (x, y) = f(x1, x2, x3, y1, y2, y3, P, I1, I2, I3)
• More complex approaches to amalgamate more than three sensor readings possible
• Those are based on numerical methods -- do not provide a closed form equation between sensor reading and tracking location!
Server can use simple triangulation to convert track quality to sensor intensity quality tolerances and use a more complex approach to track.
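One way to obtain the closed form is to subtract the first circle equation from the other two, which leaves a 2x2 linear system in (x, y). The sketch below assumes the inverse-square law I = P/(4πr²) for the squared ranges; the function name and argument layout are illustrative.

```python
import math


def triangulate(sensors, intensities, power):
    """Locate a source from three sensor positions and their intensity readings.

    sensors: three (x, y) tuples; intensities: three readings; power: source power P.
    Assumes each reading follows I = P / (4 * pi * r^2)."""
    (x1, y1), (x2, y2), (x3, y3) = sensors
    r1sq, r2sq, r3sq = (power / (4 * math.pi * i) for i in intensities)

    # Subtracting circle 1 from circles 2 and 3 cancels the x^2 and y^2 terms.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1sq - r2sq + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = r1sq - r3sq + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2

    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("sensors are collinear; position is not unique")
    # Cramer's rule for the 2x2 system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Synthetic check: generate readings for a known position, then recover it.
sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_position, power = (3.0, 4.0), 100.0
readings = [power / (4 * math.pi * ((true_position[0] - sx) ** 2 + (true_position[1] - sy) ** 2))
            for sx, sy in sensors]
print(triangulate(sensors, readings, power))
```

Because the result is an explicit function of the readings, an error bound on the track can be pushed back to a bound on each intensity, which is what the server needs.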
Track quality --> data quality
Let I_i be the intensity value of sensor i, and ΔI_i its deviation between times t_i and t_(i+1).
Case 1 (power constant)
• If $|\Delta I_i| \le I_i^2 \xi / (1 + I_i \xi)$, then track quality is guaranteed to be within $\epsilon_{track}$,
where $\xi = \epsilon_{track}^2 / C$ and C is a constant derived from the known locations of the sensors and the power of the object.
Case 2 (power varies between [P_min, P_max])
• If $|\Delta I_i|$ is within the corresponding bound (a function of $I_i$, $P_{min}$, $P_{max}$, $\epsilon_{track}$ and C'), then track quality is guaranteed to be within $\epsilon_{track}$,
where C' = C/P^2 and is a constant.
• The above constraint is a conservative estimate. Better bounds possible
[Figure: three plots of intensity (I1, I2, I3) versus time between t_i and t_(i+1), each marked with its ΔI; and an X (m) versus Y (m) plot of the track marked with ε_track]
Components of an Information Collection Framework
[Figure: information sources (data sources, DS) on one side and information consumers on the other, connected through an information mediator. Labelled flows: consumer request, source update, request.]
Sensor Model
• Wireless sensors: battery operated, energy constrained
• States
– S0: monitor (processor on, sensor on, radio off)
– S1: active (processor on, sensor on, radio on)
– S2: quasi-active (processor on, sensor on, radio intermittent)
• Transitions
– S0 to S1: intensity above threshold
– back to S0: removed from “active list”
Data Collection Protocols
Sensor-Side protocol:
• When not in use:
– tell server to remove it from “active list”, switch to monitor mode S0
• Upon external event:
– if in S0, change to active mode S1, and update every time instant
– if in S2, update only when error bound violated
Server-Side protocol:
• If sensor state changes to S1
– add it to “active list”
– compute an error bound for it, and send to the sensor
• else, when value received, update server cache if the sensor is in “active
list”
Data Collection Problem
[Figure: sensor time series …p[n], p[n-1], …, p[1] flowing to the server]
• Let P = <p[1], p[2], …, p[n]> be a sequence of environmental measurements (time series) generated by the producer, where n = now
• Let S = <s[1], s[2], …, s[n]> be the server side representation of the sequence
• A within-ε quality data collection protocol guarantees that for all i: error(p[i], s[i]) < ε
• ε is derived from application quality tolerance
Answering Queries
[Figure: queries Q1 … Qm with quality tolerances (A1) … (Am) arrive at the server, which holds an imprecise data representation [l_i, u_i] for sensor s_i. The sensor sends sensor-initiated updates from its time series …p[n], p[n-1], …, p[1]; the server can probe the sensor and receives a probe result.]
• If query quality tolerance satisfied at server (tolerance more than ε)
– Answer query at the server
• Else
– Probe the sensor
– Sensor guaranteed to respond within a bounded time
• Approach guarantees quality tolerance of queries
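The decision the server makes for each query can be sketched as follows. It assumes the server's imprecise representation is an interval [l_i, u_i] and that answering with the midpoint is acceptable when half the interval width is within the query's tolerance; both the criterion and all names are illustrative assumptions.

```python
class Sensor:
    """Stand-in for a remote sensor; counts how often it is probed."""

    def __init__(self, value: float):
        self.value = value
        self.probes = 0

    def probe(self) -> float:
        self.probes += 1
        return self.value


def answer_query(tolerance, cached_interval, sensor):
    """Answer from the server's interval when it is tight enough, else probe."""
    low, high = cached_interval
    # The midpoint is at most half the interval width away from the true value.
    if (high - low) / 2 <= tolerance:
        return (low + high) / 2
    return sensor.probe()


sensor = Sensor(9.0)
cached = (6.0, 12.0)   # the slide's example: 9 represented as [6, 12]
print(answer_query(3.0, cached, sensor), sensor.probes)
print(answer_query(1.0, cached, sensor), sensor.probes)
```

Only the second, tighter query reaches the sensor, so the sensor's radio is used in proportion to how demanding the query load is.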
Simple Data Collection Protocol
[Figure: sensor time series …p[n], p[n-1], …, p[1]]
• Sensor logic (at time step n)
Let p’ = last value sent to server
if error(p[n], p’) > ε or on timeout
send p[n] to server --- sensor switches radio on, if need be
• Server logic (at time step n)
If new update p[n] received at step n
s[n] = p[n]
Else
s[n] = last update sent by sensor
– guarantees maximum error at server less than or equal to ε
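The sensor and server logic can be simulated together in one loop. This is a minimal sketch: it uses absolute difference as the error function, omits the timeout branch, and uses illustrative names and readings.

```python
def collect(readings, epsilon):
    """Simulate the sensor/server pair over a time series.

    Returns (server_view, messages_sent), where server_view[n] is s[n]."""
    server_view = []
    last_sent = None
    messages = 0
    for p in readings:
        # Sensor side: transmit only when the last sent value is off by more than epsilon.
        if last_sent is None or abs(p - last_sent) > epsilon:
            last_sent = p
            messages += 1
        # Server side: s[n] is the new update if one arrived, else the previous update.
        server_view.append(last_sent)
    return server_view, messages


readings = [20.0, 20.4, 21.2, 21.5, 23.0, 22.8]
view, sent = collect(readings, epsilon=1.0)
print(view, sent)
```

Half of the readings are never transmitted, yet every server-side value stays within epsilon of the true reading.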
Exploiting Prediction Models
• Producer and server agree upon a prediction model (M, θ)
• Let spred[i] be the predicted value at time i based on (M, θ)
• Sensor logic (at time step n)
if error(p[n], spred[n]) > ε
send p[n] to server
• Server logic (at time step n)
If new update p[n] received at step n
s[n] = p[n]
Else
s[n] = spred[n] based on model (M, θ)
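The same loop works with a shared predictor in place of the last sent value. The sketch below assumes a linear-extrapolation model whose slope is re-fitted from the two most recent updates, which both sides can compute without extra messages; that model choice and the names are illustrative assumptions.

```python
def collect_with_prediction(readings, epsilon):
    """Simulate sensor and server sharing a linear prediction model.

    Returns (server_view, messages_sent)."""
    anchor = None      # (time step, value) of the most recent update
    slope = 0.0        # model parameter, re-fitted on every update
    server_view, messages = [], 0
    for n, p in enumerate(readings):
        predicted = None if anchor is None else anchor[1] + slope * (n - anchor[0])
        if predicted is None or abs(p - predicted) > epsilon:
            # Sensor sends p[n]; both sides re-fit the slope from the last two updates.
            if anchor is not None:
                slope = (p - anchor[1]) / (n - anchor[0])
            anchor = (n, p)
            messages += 1
            server_view.append(p)
        else:
            # No message: the server uses the predicted value as s[n].
            server_view.append(predicted)
    return server_view, messages


ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
view, sent = collect_with_prediction(ramp, epsilon=0.5)
print(view, sent)
```

On a steady ramp the model is learned after two messages and nothing more needs to be sent, whereas comparing against the last sent value would trigger an update at every step.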
Challenges in Prediction
• Simple versus complex models?
– Complex and more accurate models require more parameters (that
will need to be transmitted).
– Goal is to minimize cost not necessarily best prediction
• How is a model M generated?
– static -- one out of a fixed set of models
– dynamic -- dynamically learn a model from data
• When should a model M or parameters θ be changed?
– immediately on model violation:
• too aggressive: violation may be a temporary phenomenon
– never changed:
• too conservative: data rarely follows a single model
Challenges in Prediction (cont.)
• Who updates the model?
– Server
• long-haul prediction models possible, since server maintains history
• might not predict recent behavior well since server does not know exact P sequence; server has only samples
• extra communication to inform the producer
– Producer
• better knowledge of recent history
• long haul models not feasible since producer does not have history
• producers share computation load
– Both
• server looks for new models, sensor performs parameter fitting given
existing models.
Experiment (error tolerance 20m)
• A restricted random motion: the object starts at (0,d) and moves from one node to another randomly chosen node until it walks out of the grid.
• Models used: static and linear
Energy Savings
• total energy consumption over all sensor nodes for random mobility model with varying ε_track (track error).
• significant energy savings using adaptive precision protocol over non-adaptive tracking (constant line in graph)
• for a random model, prediction does not work well!
Energy Savings
• total energy consumption over all sensor nodes for random mobility model with varying base station distance from sensor grid.
• As base station moves away, one can expect energy consumption to increase since transmission cost varies as d^n (n = 2)
• better results with increasing base station distance
Outline of this section
• Sensor network architectures
• Sensor application needs
– Accuracy, timeliness, cost, reliability
• Tasks of a middleware framework
– Services that can be customized to address needs
• Case studies
– accuracy/cost tradeoffs in collection
– Accuracy/cost/timeliness tradeoffs in collection
– Storage/accuracy tradeoffs in archival
Accuracy/Cost Tradeoff
• Applications can tolerate errors in sensor data
– applications may not require exact answers:
• small errors in location during tracking or error in answer to query result
may be OK
– data cannot be precise due to measurement errors, transmission
delays, etc.
• Cost
– Communication bandwidth
– Energy drain
• Quasar Approach
– exploit application error tolerance to reduce communication
between producer and server and/or to conserve energy
– Two approaches
• Minimize resource usage given quality constraints
• Maximize quality given resource constraints
Modeling cost as communication bandwidth (e.g. TRAPP)
• Goal: Minimize network usage while meeting application-specific precision requirements
• Our solution:
– Caches store approximations of exact source values
– Queries have precision constraints
[Figure: precision versus performance spectrum running from stale cache to exact cache, with the operating point marked "you decide"]
Modeling energy costs in sensors
• How should sensor state be managed to minimize energy
consumption in maintaining data at required quality
– Sensor State: error precision, power states
• Power consumption of sensors

Sensor state | Radio mode | Power consumption (mW)
active       | Tx         | 14.88
listening    | Rx         | 12.50
listening    | idle       | 12.36
sleeping     | off        | 0.016
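To make the power figures concrete, the following minimal Python sketch computes the energy of a duty cycle from the table values. The two schedules and their durations are illustrative assumptions, not measurements from the tutorial.

```python
# Power draw per (sensor state, radio mode) in mW, taken from the table above.
POWER_MW = {
    ("active", "Tx"): 14.88,
    ("listening", "Rx"): 12.50,
    ("listening", "idle"): 12.36,
    ("sleeping", "off"): 0.016,
}

def energy_mj(schedule):
    """Total energy in millijoules for a list of (state, mode, seconds)."""
    # mW * s = mJ
    return sum(POWER_MW[(state, mode)] * seconds
               for state, mode, seconds in schedule)

# Hypothetical one-minute schedules (durations are assumptions).
always_listening = [("listening", "idle", 60.0)]
mostly_sleeping = [
    ("active", "Tx", 1.0),
    ("listening", "idle", 4.0),
    ("sleeping", "off", 55.0),
]

print(energy_mj(always_listening), energy_mj(mostly_sleeping))
```

Under these assumed durations, the gap between the two schedules comes almost entirely from time spent listening, which is why the sleeping state matters so much.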
Energy Efficient Sensor State Management
Active-Listening-Sleeping Model (ALS), with transitions as labeled in the state diagram:
• active → listening: Ta after processing the last sensor-initiated update or probe
• listening → active: upon first sensor-initiated update or probe
• listening → sleeping: after Tl without traffic
• leaving sleeping: upon first sensor-initiated update, or after Ts
Other models:
• Always-Active (AA) [Ta is infinite]
• Active-Listening (AL) [Tl is infinite]
• Active-Sleeping (AS) [Tl is 0]
Issues in Energy Efficient Data Collection
• Issues
– How to maintain the precision range δ for each sensor
• Larger δ increases the possibility of expensive probes
• Smaller δ wastes communication due to sensor-initiated updates
– When to transition between sensor states (i.e., set Ta, Tl, Ts)
• Powering down might not be optimal if we have to power up immediately
• Powering down may increase query response time
• Objective
– Set values for Ta, Tl, Ts, and δ that minimize energy cost
– normalized energy cost = energy consumed at each state + state transition energy
Addressing Accuracy/Energy Tradeoffs
• We solve the energy optimization problem by
solving two sub-problems
– Optimize energy consumption by adjusting range
size under the assumption that the state transition
is fixed
• I.e., Ta, Tl, and Ts have been optimally set
– Optimize energy consumption by adapting sensor
states while assuming that the precision range for
sensor is fixed
Range size Adjustment for the AA/AL Model
• Optimal precision range δ that minimizes E occurs when the ratio of sensor-initiated update probability to probe probability takes a specific value (equation shown on the original slide)
– Optimal range can be realized by maintaining this probability ratio
– Can be done at the sensor
• Assuming that ρ is the ratio of sensor-initiated update probability to probe probability:
– for sensor-initiated update: with probability min{ρ, 1}, set δ' = δ(1+α)
– for probe: with probability min{1/ρ, 1}, set δ' = δ/(1+α)
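The adjustment rule above can be sketched in a few lines of Python. This is only an illustration: the names `delta`, `rho`, and `alpha` stand for the precision range, the target probability ratio, and the growth factor, and the symbol choices are assumptions since the slide glyphs did not survive extraction.

```python
import random

def adjust_range(delta, event, rho, alpha, rng=random):
    """Return the precision range after one sensor-initiated update or probe."""
    if rho <= 0:
        raise ValueError("rho must be positive")
    if event == "update":
        # A sensor-initiated update suggests the range is too tight: widen it.
        if rng.random() < min(rho, 1.0):
            return delta * (1.0 + alpha)
    elif event == "probe":
        # A probe suggests the range is too loose: shrink it.
        if rng.random() < min(1.0 / rho, 1.0):
            return delta / (1.0 + alpha)
    else:
        raise ValueError("event must be 'update' or 'probe'")
    return delta

# With rho = 1 both adjustments always fire.
widened = adjust_range(10.0, "update", rho=1.0, alpha=0.5)
shrunk = adjust_range(widened, "probe", rho=1.0, alpha=0.5)
```

Because the rule only needs the event type and a random draw, it can run on the sensor itself, as the slide notes.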
Range Size Adjustment for the AS/ALS Model
• Sensor side
– Keep track of the number of state transitions of the last k updates
– Piggyback the probability of state transitions with the k-th update
• Server side
– Keep track of the number of sensor-initiated updates and probes of the last k updates
– Upon receiving the k-th update from the sensor
• Compute the optimal precision range δ
• Inform the sensor about the new δ
Adaptive State Management
• Consider the AS model for derivation of optimal Ta to
minimize energy consumption
– Assuming φ(t) is the probability of receiving a request at time instant t, the expected energy consumption E for a single silent period is given by an equation shown on the original slide
– E is minimized when Ta = 0 if requests are uniformly distributed in the interval [0, Ta+Ts].
• In practice, learn φ(t) at runtime and select Ta adaptively
– Choose a window size w in advance
– Keep track of the last w silent period lengths and summarize this information in a histogram
– Periodically use the histogram to generate a new Ta
Adaptive State Management (Cont.)
• ci: the number of silent periods for bin i among the last w silent periods
• Estimate φ by the distribution which generates a silent period of length ti with probability ci/w
• Ta is chosen to be the value tm that minimizes the energy consumption (equation shown on the original slide)
[Figure: histogram of silent-period lengths with bins 0 … n-1, counts c0 … cn-1, and bin boundaries t0, t1, …, tn-1, tn = Ta+Ts]
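A rough Python sketch of the histogram step follows. The estimate ci/w comes from the slide; the energy function `toy_energy` is a made-up stand-in (the slide's energy formula is not available in the text), and using the left bin edge as the representative length ti is also an assumption.

```python
from collections import Counter

def empirical_distribution(silent_lengths, bin_edges):
    """Map each bin index i to c_i / w over the last w silent periods."""
    w = len(silent_lengths)
    counts = Counter()
    for length in silent_lengths:
        for i in range(len(bin_edges) - 1):
            # Bin i covers [bin_edges[i], bin_edges[i + 1]).
            if bin_edges[i] <= length < bin_edges[i + 1]:
                counts[i] += 1
                break
    return {i: counts[i] / w for i in range(len(bin_edges) - 1)}

def choose_ta(dist, bin_edges, energy):
    """Pick the bin boundary t_m with the lowest expected energy."""
    def expected(ta):
        return sum(p * energy(ta, bin_edges[i]) for i, p in dist.items())
    return min(bin_edges[:-1], key=expected)

def toy_energy(ta, silent):
    """Hypothetical cost: stay active up to ta, pay a wake-up cost afterwards."""
    return 14.88 * min(ta, silent) + (5.0 if silent > ta else 0.0)

edges = [0, 2, 4, 6]
dist = empirical_distribution([1, 1, 2, 5], edges)
best_ta = choose_ta(dist, edges, toy_energy)
```

Any energy model with the signature `energy(ta, silent_length)` can be plugged into `choose_ta`; the selection logic itself does not depend on the toy cost.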
System Performance Comparison
[Figure: two bar charts comparing the AA, AL, AS, and ALS models. Panel titles: "Sensor Energy Consumption Comparison" (y-axis: normalized sensor energy consumption, uJ) and "Query Response Time Comparison" (y-axis: average query response time, us).]
Impact of Ta adaptation on System Performance
[Figure: two bar charts comparing static Ta(0) and adaptive Ta. Panel titles: "Impact of Ta Selection on Sensor Energy Consumption" (y-axis: normalized sensor energy consumption, uJ) and "Impact of Ta Selection on Query Response Time" (y-axis: average query response time, us).]
Impact of Range Size Adaptation on System Performance
[Figure: two bar charts comparing fixed(0), average accuracy constraint, adaptive adjustment, and fixed(large). Panel titles: "Impact of Range Size Adjustment on Sensor Energy Consumption" (y-axis: normalized sensor energy consumption, uJ) and "Impact of Range Size Adjustment on Query Response Time" (y-axis: average query response time, ms).]
Outline of this section
• Sensor network architectures
• Sensor application needs
– Accuracy, timeliness, cost, reliability
• Tasks of a middleware framework
– Services that can be customized to address needs
• Case studies
– Accuracy/cost tradeoffs in collection
– Accuracy/cost/timeliness tradeoffs in collection
– Storage/accuracy tradeoffs in archival
Accuracy/Cost/Timeliness Tradeoffs
• Continuous stream of fast changing source data
• Diverse user requirements in terms of data accuracy and
service timeliness
• Effective utilization of underlying computation, communication
and storage resources
• Competing goals of timeliness, accuracy, and cost-effectiveness
Real-time Communication for
sensors
• John A. Stankovic, Tarek Abdelzaher, Chenyang
Lu, Lui Sha, Jennifer Hou, "Real-Time
Communication and Coordination in Embedded
Sensor Networks," Proceedings of the IEEE,
91(7): 1002-1022, July 2003. (invited paper)
• SPEED: a stateless protocol (ICDCS’03)
• RAP (RTAS’02)
Real-time Data Processing
• Supporting transaction timeliness and data
freshness in databases
– STRIP (STanford Real-time Information
Processor)
– ARCS (databases for Active Rapidly Changing
data Systems)
– QMF (QoS sensitive approach for Miss Ratio and
Freshness guarantees)
Modeling Application Timeliness Needs
• Precision of a stored range [L, U] around the source value: $PREC(U, L) = \frac{1}{U - L}$
• Timeliness requirements: (source ID, request issue time, periodicity, urgency, relative deadline)
• Source update request = timeliness requirements + current value
• Consumer request = timeliness requirements + (accuracy requirement, bias)
• bias = 0: no preference; 1: favoring timeliness; 2: favoring accuracy
QoS as a metric of user satisfaction
• Timeliness satisfaction = deadline is met: $TT \le RDL$
• Accuracy satisfaction = answer precision meets the requirement, $PREC_{answer} \ge PREC_{req}$, and answer fidelity is 1:
$$Fidelity(A, cr) = \begin{cases} 1 & L \le V(cr.s, cr.t) \le U \\ 0 & \text{otherwise} \end{cases}$$
• Overall QoS is a weighted sum of the satisfaction ratios of the three request classes:
$$QoS = w_1 \frac{|\{cr_j \mid Bias_{cr_j} = 1,\ TT_{cr_j} \le RDL_{cr_j}\}|}{|\{cr_j \mid Bias_{cr_j} = 1\}|} + w_2 \frac{|\{cr_j \mid Bias_{cr_j} = 2,\ Fidelity(A_{cr_j}) = 1,\ PREC_{A_{cr_j}} \ge PREC_{cr_j}\}|}{|\{cr_j \mid Bias_{cr_j} = 2\}|} + w_3 \frac{|\{cr_j \mid Bias_{cr_j} = 0,\ TT_{cr_j} \le RDL_{cr_j},\ PREC_{A_{cr_j}} \ge PREC_{cr_j},\ Fidelity(A_{cr_j}) = 1\}|}{|\{cr_j \mid Bias_{cr_j} = 0\}|}$$
– First term: QoS for requests favoring timeliness (timeliness satisfaction)
– Second term: QoS for requests favoring accuracy (accuracy satisfaction)
– Third term: QoS for requests without bias (timeliness & accuracy satisfaction)
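The weighted QoS sum can be illustrated with a short Python sketch. The `Served` record and its field names are hypothetical, and skipping a class that has no requests is an assumption (the slide does not say how an empty class is handled).

```python
from dataclasses import dataclass

@dataclass
class Served:
    bias: int           # 0 = no preference, 1 = timeliness, 2 = accuracy
    tt: float           # turnaround time TT
    rdl: float          # relative deadline RDL
    prec_answer: float  # precision of the answer
    prec_req: float     # precision required by the request
    fidelity: int       # 1 if the true value lies in [L, U], else 0

def timely(r):
    return r.tt <= r.rdl

def accurate(r):
    return r.prec_answer >= r.prec_req and r.fidelity == 1

def qos(requests, w1, w2, w3):
    """Weighted sum of per-class satisfaction ratios."""
    classes = [
        (1, w1, timely),
        (2, w2, accurate),
        (0, w3, lambda r: timely(r) and accurate(r)),
    ]
    total = 0.0
    for bias, weight, satisfied in classes:
        group = [r for r in requests if r.bias == bias]
        if group:  # assumption: an empty class contributes nothing
            total += weight * sum(satisfied(r) for r in group) / len(group)
    return total

sample = [
    Served(1, tt=1, rdl=2, prec_answer=0, prec_req=0, fidelity=1),
    Served(1, tt=3, rdl=2, prec_answer=0, prec_req=0, fidelity=1),
    Served(2, tt=9, rdl=2, prec_answer=0.5, prec_req=0.4, fidelity=1),
    Served(0, tt=1, rdl=2, prec_answer=0.1, prec_req=0.2, fidelity=1),
]
score = qos(sample, w1=0.5, w2=0.3, w3=0.2)
```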
Quality of Data Characterization
• DS fidelity (DS vs. source value)
– Fidelity of s at time instant t:
$$FI_{ds}(s, t) = \begin{cases} 1 & \text{if } L \le v \le U \\ 0 & \text{otherwise} \end{cases}$$
– Probability of accessing a faithful s value during T:
$$p_{fi}(s, T) = FI_{ds}(s, T) = FI(s, [t_i, t_j]) = \frac{1}{T} \int_{t_i}^{t_j} FI_{ds}(s, t)\,dt$$
– Aggregate DS fidelity:
$$AFI_{ds}(S, T) = \sum_{s_i \in S} p_{access}(s_i) \cdot p_{fi}(s_i, T) = \sum_{s_i \in S} p_{access}(s_i) \cdot FI_{ds}(s_i, T)$$
• DS validity (DS vs. consumer needs)
$$VA_{ds}(cr_i(s, t)) = \begin{cases} 1 & \text{if } PREC(L, U) \ge PREC_{cr_i} \\ 0 & \text{otherwise} \end{cases}$$
$$VA_{ds}(s, t) = \frac{\sum_{i=1}^{k} VA_{ds}(cr_i(s, t))}{k}$$
$$p_{va}(s, T) = VA_{ds}(s, T) = VA(s, [t_i, t_j]) = \frac{\sum_{v=t_i}^{t_j} \sum_{u=1}^{k_s} VA_{ds}(cr_u(s, t_v))}{k_s}$$
– Aggregate DS validity:
$$AVA_{ds}(S, T) = \sum_{s_i \in S} p_{access}(s_i) \cdot p_{va}(s_i, T) = \sum_{s_i \in S} p_{access}(s_i) \cdot VA_{ds}(s_i, T)$$
• Overall QoD: combines $AFI_{ds}(S, T)$ and $AVA_{ds}(S, T)$
Objectives of real-time data collection
• Given a set of sources S = {s1, …, sl} and an input instance I, which is a collection of m source update requests and n consumer requests, I = SR ∪ CR = {sr1, …, srm; cr1, …, crn}, our goal is to
– Maximize QoS
– Maximize QoD
– Minimize Cost
Joint optimization of QoS, QoD and Cost
• Dynamicity
– Highly dynamic system and network condition
– Unpredictable application workload
– Frequently changing information sources
• Inter-relationship between QoS and QoD is not straightforward: QoD ↔ QoS?
– Prioritize source update requests
• ↑ QoD → ↑ deadline miss ratio → ↓ QoS & missing opportunities
– Prioritize consumer requests
• ↑ QoS → stale data → ↓ QoD & making wrong decisions
One approach
• Frame the tradeoffs as two sub-problems
– Manipulate QoS via a scheduling algorithm, assuming
DS is well maintained (QoD)
– Adjust QoD via a DS maintenance algorithm, assuming
an efficient scheduling algorithm is applied (QoS)
Design of the Information Mediator
[Figure: architecture of the Information Mediator, which sits between Information Consumers and Information Sources. Components: scheduler, consumer request queue, source update request queue, request servicer, DS, and DS maintainer. Labeled flows: consumer request, source update request, consumer-initiated source update request, consumer-initiated probe, check value / stored range, feedback, answer, update, probe.]
Design of the Scheduling Algorithm
• Issues
– Decide on an ordering of the incoming source update
requests
• The most recent update will be processed first
– Decide on a relative ordering of source update and consumer requests
Scheduling Strategies
• CF (Consumer request First)
• SF (Source update request First)
• SU (Split Update)
– Updates from popular data are assigned higher
priority than consumer requests
• OD (On-Demand Update)
– Only when consumer requests encounter stale
data, will the corresponding source update
requests be applied
Timeliness-Accuracy Balanced Scheduling (TABS)
• Assign absolute deadlines (ADL)
– Periodic requests (issued at time t with period PER and relative deadline RDL): ADL = t + PER or ADL = t + RDL, as drawn in the two timeline cases
– Processor utilization of the periodic requests: $U_P = \sum_{i=1}^{n_p} \frac{E_i}{\min\{RDL_i, PER_i\}}$
– Aperiodic requests: $U_{AP} \le 1 - U_P$; request i arriving at time t gets $ADL_i = \max(t, ADL_{i-1}) + E_i / U_{AP}$
• Apply Earliest-Deadline-First
• TABS schedulability: given a set of np periodic requests with processor utilization UP and a TB server with processor utilization UAP, the whole set of tasks is schedulable if UP + UAP <= 1.
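The deadline assignment above can be tried out with a small Python sketch. It treats Ei as the execution time of request i, which is an assumption (the slide does not define it), and the sample requests are invented for illustration.

```python
def periodic_utilization(periodic):
    """U_P = sum of E_i / min(RDL_i, PER_i) over (E, RDL, PER) tuples."""
    return sum(e / min(rdl, per) for e, rdl, per in periodic)

def aperiodic_deadlines(arrivals, u_ap):
    """Assign ADL_i = max(t_i, ADL_{i-1}) + E_i / U_AP to (t, E) arrivals."""
    deadlines = []
    previous = 0.0
    for t, e in arrivals:
        previous = max(t, previous) + e / u_ap
        deadlines.append(previous)
    return deadlines

def edf_order(deadlines):
    """Indices of requests sorted by earliest absolute deadline."""
    return sorted(range(len(deadlines)), key=lambda i: deadlines[i])

# Hypothetical workload.
u_p = periodic_utilization([(1, 4, 5), (2, 10, 8)])  # 1/4 + 2/8
u_ap = 1.0 - u_p                                      # remaining bandwidth
adl = aperiodic_deadlines([(0, 1), (1, 2), (10, 1)], u_ap)
```

Here the periodic load uses half of the processor, so each aperiodic request is stretched by a factor of two when its deadline is assigned.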
Minimized Cost Directory Service Maintenance (MC)
• Analyze cost involved in the collection process
• Range adjustment
– Consumer-initiated update: shrink the range
– Source-initiated update: curve fitting, comparing the slope mw of the fitted curve in monitoring window w with the slope mw-1 of window w-1
• mw > mw-1: increase range size
• mw < mw-1: decrease range size
[Figure: source value vs. time with the fitted curve and its slope over monitoring windows w-1 and w]
Experiments
• Performance metrics
– QoS, QoD, Cost (the number of messages exchanged)
– Efficiency of System EoS (QoS QoD/Cost)
• Experiments
– Evaluation of all possible policy combinations in terms of the overall EoS
– Evaluation of system heterogeneity in terms of source capabilities and deadline variations
– Evaluation of the benefits of adding intelligence to each subcomponent of the mediator
Benefits of Intelligent Policies
[Figure: EoS vs. the number of sources (25, 50, 100, 150) for TABS+MC, TABS+SS, and FCFS+SS]
• The EoS improves as more intelligence is added to each component
– TABS ensures fairness among the requests
– MC decreases the DS maintenance overhead
Fusing Energy Efficient Data Collection and In-network Aggregation
[Figure: hierarchy of sensors reporting to access points]
• Issues
– Hierarchical precision range δ adjustment
– Cluster forming and dynamic maintenance
Value update -- 1
[Figure, panels (a) and (b): access point (AP), cluster head C1, and sensors n1, n2.
(a) C1: {200-20, 200+20}; n1: {100-10, 100+10}; n2: {100-10, 100+10}
(b) value 112 at n1; n1: {112-10, 112+10}; n2: {100-10, 100+10}; C1: {212-20, 212+20}; the range {212-10, 212+10} also appears in the figure]
Value update -- 2
[Figure, panels (c) and (d):
(c) value 112 at both n1 and n2; n1: {112-10, 112+10}; n2: {112-10, 112+10}; C1: {224-20, 224+20}; value 224 sent toward the AP
(d) C1: {200-20, 200+20}; values 112, 113.7, 86.3, and 85 appear; n1: {113.7-10, 113.7+10}; n2: {86.3-10, 86.3+10}]
Error Adjustment
• When?
– (fmax - fmin)/fmax >= rth
• How?
– dfmax = a* dfmax +(1-a)*(dfmax + dfmin)*(fmax /(fmax + fmin))
– dfmin = a* dfmin +(1-a)*(dfmax + dfmin)*(fmin /(fmax + fmin))
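The two update formulas can be checked with a minimal Python sketch. The reading of the symbols is an assumption: fmax and fmin are taken to be the highest and lowest update frequencies among the nodes involved, dfmax and dfmin the error bounds of those two nodes, a a smoothing weight, and rth the trigger threshold.

```python
def needs_adjustment(f_max, f_min, r_th):
    """Trigger condition: (fmax - fmin) / fmax >= rth."""
    return (f_max - f_min) / f_max >= r_th

def adjust_errors(d_fmax, d_fmin, f_max, f_min, a):
    """Redistribute the combined error budget in proportion to frequency."""
    budget = d_fmax + d_fmin
    total_f = f_max + f_min
    new_d_fmax = a * d_fmax + (1 - a) * budget * (f_max / total_f)
    new_d_fmin = a * d_fmin + (1 - a) * budget * (f_min / total_f)
    return new_d_fmax, new_d_fmin

# Hypothetical numbers: equal bounds, one node updating three times as often.
new_max, new_min = adjust_errors(10.0, 10.0, f_max=30.0, f_min=10.0, a=0.5)
```

Note that the two formulas conserve the combined budget: the new bounds always sum to dfmax + dfmin, so the node that updates more often gains slack at the expense of the quieter one.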
Fault Tolerance Issues
• Communication
– Routing
• SPIN: disseminate data to all the sensors
• Braided Diffusion: maintain multiple braided paths as backup
• GRAB (Gradient Broadcast): controlled mesh forwarding
– Transport protocol
• PSFQ (pump slowly, fetch quickly): store-and-forward, multihop forwarding
• ESRT (event to sink reliable transmission): adjust source
reporting frequency to avoid congestion and maintain enough
reliability
• RMST (reliable multi-segment transport): MAC layer
• Storage
– R-DCS (Resilient Data Centric Storage): store event data at
the closest R replica nodes
Outline of this section
• Sensor network architectures
• Sensor application needs
– Accuracy, timeliness, cost, reliability
• Tasks of a middleware framework
– Services that can be customized to address needs
• Case studies
– Accuracy/cost tradeoffs in collection
– Timeliness/accuracy/cost tradeoffs in collection
– Storage/accuracy tradeoffs in archival
Archiving Sensor Data
• Often sensor-based applications are built with only the real-time
utility of time series data.
– Values at time instants <<n are discarded.
• Archiving such data consists of maintaining the entire S
sequence, or an approximation thereof.
• Importance of archiving:
– Discovering large-scale patterns
– Once-only phenomena, e.g., earthquakes
– Discovering “events” detected post facto by “rewinding” the time
series
– Future usage of data which may be not known while it is being
collected
Quality Sensitive Archival
• Let P = < p[1], p[2], …, p[n] > be the sensor time series
• Let S = < s[1], s[2], …, s[n] > be the server side representation
• A within-ε_archive quality data archival protocol guarantees that error(p[i], s[i]) < ε_archive
• Trivial solution: modify the collection protocol to collect data at a quality guarantee of min(ε_archive, ε_collect)
– then the data collection protocol described earlier will provide an ε_archive quality data stream that can be archived.
• Better solutions possible since
– archived data not needed for immediate access by real-time or
forecasting applications (such as monitoring, tracking)
– compression can be used to reduce data transfer
Addressing Cost/Quality Tradeoffs in Data Archival – Sample Protocol
[Figure: the sensor sends updates for data collection (…p[n], p[n-1], …) and, from its memory buffer, a compressed representation for archiving. Processing at the sensor is exploited to reduce communication cost and hence battery drain.]
• The sensor compresses the observed time series p[1:n] and sends a lossy compression to the server
• At time n:
– p[1:n-nlag] is at the server in compressed form s'[1:n-nlag], within ε_archive
– s[n-nlag+1:n] is estimated via a predictive model (M, θ)
• the collection protocol guarantees that this remains within ε_collect
– s[n+1:] can be predicted but its quality is not guaranteed
• it is in the future and thus the sensor has not observed these values
Piecewise Constant Approximation (PCA)
• Given a time series Sn = s[1:n], a piecewise constant approximation of it is a sequence
PCA(Sn) = < (ci, ei) >
that allows us to estimate s[j] as:
$$s_{capt}[j] = \begin{cases} c_i & \text{if } j \in [e_{i-1}+1,\ e_i] \\ c_1 & \text{if } j < e_1 \end{cases}$$
[Figure: value vs. time, showing constant segments c1, c2, c3, c4 ending at e1, e2, e3, e4]
Online Compression using PCA
• Goal: Given a stream of sensor values, generate a within-ε_archive PCA representation of the time series
• Approach (PMC-midrange)
– Maintain m, M as the minimum/maximum values of observed samples since the last segment
– On processing p[n], update m and M if needed
• if M - m > 2ε_archive, output a segment ((m+M)/2, n)
[Figure: example with ε_archive = 1.5, plotting value against time for time steps 1 to 5]
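A compact Python sketch of the midrange approach is given below. One reading is assumed here: when a new sample would stretch the range beyond 2ε, the segment is closed at n-1 using the minimum and maximum seen before that sample, so that every sample in the segment stays within ε of the emitted value. The function names are illustrative.

```python
def pmc_midrange(samples, eps):
    """Return [(c_i, e_i)] segments with |sample - c_i| <= eps in each segment."""
    segments = []
    lo = hi = None
    for n, value in enumerate(samples, start=1):
        if lo is None:
            lo = hi = value
            continue
        new_lo, new_hi = min(lo, value), max(hi, value)
        if new_hi - new_lo > 2 * eps:
            # Close the current segment at n-1 with the range before this sample.
            segments.append(((lo + hi) / 2, n - 1))
            lo = hi = value
        else:
            lo, hi = new_lo, new_hi
    if lo is not None:
        # Flush the last open segment at the end of the series.
        segments.append(((lo + hi) / 2, len(samples)))
    return segments

def reconstruct(segments):
    """Expand segments back into one estimated value per time step."""
    values, start = [], 1
    for c, end in segments:
        values.extend([c] * (end - start + 1))
        start = end + 1
    return values

series = [1.0, 2.0, 3.0, 6.0, 7.0]
segs = pmc_midrange(series, eps=1.0)
approx = reconstruct(segs)
```

Only the running minimum and maximum are kept between samples, which matches the constant storage claimed for the method.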
Online Compression using PCA
• PMC-MR …
– guarantees that each segment compresses the corresponding
time series segment to within-archive
– requires O(1) storage
– is instance optimal
• no other PCA representation with fewer segments can meet the
within-archive constraint
• Variant of PMC-MR
– PMC-MEAN, which takes the mean of the samples seen thus far
instead of mid range.
Improving PMC using Prediction
• Observation
– Prediction models guarantee a within-ε_collect version of the time series at the server even before the compressed time series arrives from the producer.
• Can the prediction model be exploited to reduce the overhead of compression?
– If ε_archive > ε_collect, no additional effort is required for archival --> simply archive the predicted model.
• Approach:
– Define an error time series E[i] = p[i] - spred[i]
– Compress E[1:n] to within ε_archive instead of compressing p[1:n]
– The archive contains the prediction parameters and the compressed error time series
– A within-ε_archive version of E[i] plus (M, θ) can be used to reconstruct a within-ε_archive version of p
Combining Compression and Prediction (Example)
[Figure: three plots over time 0 to 60. (1) Actual vs. predicted time series. (2) Actual vs. compressed time series (7 segments). (3) Error series (actual – predicted) and its compressed form (2 segments).]
Estimating Time Series Values
• Historical samples (before n-nlag) are maintained at the server within ε_archive
• Recent samples (between n-nlag+1 and n) are maintained by the sensor and predicted at the server.
• If an application requires precision q, then:
– if q ≥ ε_collect, it must wait for time τ in case a parameter refresh is en route
– if q ≥ ε_archive but q < ε_collect, it may probe the sensor or wait for a compressed segment
– otherwise, only probing meets the precision
• For future samples (after n), immediate probing is not available as an option
Distributed Computing Infrastructure for Sensors
• Designing Distributed Architectures for Sensor
Networks
– Server oriented -- data migrates to server from sensors
• Store or not store (stream)
• Useful for all types of applications -- archival, analysis, monitoring
• When should data migrate -- periodically, or in a quality-based way driven by the application (Quasar approach)
• Should data migrate in its original raw form or in some aggregated form?
– Distributed approach
• Data does not migrate to any single server but remains in the sensor network. Queries migrate from the server to the network
• TinyDB approach, Dimensions approach
• Real-time
• Fault tolerance
Part 3: Query Processing in Sensor
Applications
Outline
• Need for a declarative query language for
sensor applications
• Query Taxonomy
• Issues impacting sensor query processing
– Sensor database research landscape
• Sample query Processing techniques
Programming Sensor Nets Is Hard
• Applications must be “energy aware”
– Naive implementations may result in battery drain in days
while careful programming may conserve power for months
• interleave sleep with processing and transmission
– Recharging battery frequently not feasible
• Lossy, multi-hop, low-bandwidth, short range communication
– 20% loss @ 5m
– often desirable to trade computation for communication
– 200-800 instructions per bit transmitted!!
– applications must be “network aware”
• Highly distributed environments
• Once deployed, applications cannot be easily administered
• Limited development and debugging tools
High-Level Abstraction Is Needed!
Declarative Queries
• Users specify the data they want
– Simple, SQL-like queries
– Using predicates, not specific addresses
• Challenge is to provide:
– Expressive & easy-to-use interface
– High-level operators
• Well-defined interactions
• “Transparent Optimizations” that many programmers would miss
– Sensor-net specific techniques
– Power efficient execution framework
Database View of Sensor Data
• Sensors viewed as a single table
– Columns are sensor data
– Rows are individual sensors
• Sensors table is an unbounded, continuous data stream
– Operations such as sort and symmetric join are not allowed on streams
– They are allowed on bounded subsets of the stream (windows)
• SQL (with minor extensions) can be used as a declarative query language

time | Nodeid | Location | value
0    | 1      | 17       | 455
0    | 2      | 25       | 389
1    | 1      | 17       | 422
1    | 2      | 25       | 405

SELECT nodeid, nestNo, light
FROM sensors
WHERE light > 400
“Find the sensors in bright nests.”
Taxonomy of Queries
• Query Generality
– Simple selection, aggregation, full-blown SQL
• Continuous queries
– query evaluated continuously on sensor data streams
– Issues:
• How long
– For a specified period, for lifetime of sensor
• how often
– adaptive rate (based on load/utility/value), fixed rate
• Event based queries
Aggregation Queries
Query 2:
SELECT AVG(sound)
FROM sensors
EPOCH DURATION 10s
Query 3: “Count the number of occupied nests in each loud region of the island.”
SELECT region, CNT(occupied), AVG(sound)
FROM sensors
GROUP BY region
HAVING AVG(sound) > 200
EPOCH DURATION 10s
Result (regions w/ AVG(sound) > 200):

Epoch | region | CNT(…) | AVG(…)
0     | North  | 3      | 360
0     | South  | 3      | 520
1     | North  | 3      | 370
1     | South  | 3      | 520
General SQL Query
General: Is there anyone in the building?
Query plan: σ(Value > 10 dB) over sound sensors and σ(Value > 10 lm) over light sensors, joined on RoomID = RoomID
SELECT L.roomid
FROM lightsensors AS L,
soundsensors AS S
WHERE L.roomid = S.roomid
AND L.value > 10 AND S.value > 10
Event-Based Queries
• An alternative to continuous polling for data
• Example
ON EVENT bird-detector(loc):
SELECT AVG(light), AVG(temp), event.loc
FROM sensors AS s
WHERE dist(s.loc, event.loc) < 10m
SAMPLE INTERVAL 2s FOR 30s
Lifetime Queries
• Lifetime query
SELECT …
LIFETIME 30 days
– Estimate sampling rate that achieves this
SELECT …
LIFETIME 10 days
MIN SAMPLE INTERVAL 1s
– May not be able to transmit all the data
Adapted from slides ©Sam Madden
Processing Lifetimes: Issues
• Provide formulas for estimating power consumption: set maximum per-node sampling rates
• What makes this difficult?
– multiple sensing types (temp, accel) with different drain
– estimating the selectivity of predicates
• amount transmitted by a node varies widely
– root is a bottleneck: all nodes' rates must correspond to it
– aggregation vs. sending individual values
– conditions change: multiple queries, burstiness, message losses
• What to do when the network can't transmit all the data?
Adapted from slides ©Sam Madden
Issues impacting Query Processing
• Where does data reside?
– sensor/server
• Where does the query originate?
– sensor/server
• Where should the results be delivered?
– sensor/server
• How is data represented?
– Continuous data streams require unbounded storage
• Represent data as synopses (spatial/temporal aggregation)
– Sliding Windows, Samples, Sketches, Histograms, Wavelet
representation
– Precise / approximate representation
• with or without error guarantees
• guarantees can be deterministic or probabilistic
Sensor Database Research Landscape
• Type of query: aggregation, selection, general SQL, continuous, event-based
• Query evaluation: at server, in network, at both server and network
• Data representation: precise representation, approximate value, specified spatial/temporal resolution
• Data & query location: server, sensor network
Classification of Query Processing
Techniques (1)
• Data and query @ server
– Data Stream Model
• Data streams from data sources to servers
• server maintains a synopsis
• continuous queries at server
Stream Data Management
[Figure: data streams flow into a stream processing engine that keeps a synopsis in memory, evaluates continuous queries, and returns an (approximate) answer]
• Deals with problems due to dynamic data on the server side
– Data streams through the server
– Data represented as a synopsis
• sliding window, sketches, histograms, wavelets, sampling
– Load shedding
• at input: sampling
• at server: if load exceeds capacity
– Continuous queries evaluated against streaming data at the server
• But
– Does not conserve sensor resources (e.g., power)
– Does not exploit the storage and processing capabilities of sensors
– Geared towards continuous monitoring and not archival applications
• Examples: Aurora (Brown/MIT), STREAM (Stanford), Hancock (AT&T), OpenCQ (Georgia), Tapestry (Xerox), Telegraph (Berkeley), ...
Classification of Query Processing
Techniques (1)
• Data and query @ server
– Data Stream Model
• Data streams from data sources to servers
• server maintains a synopsis
• continuous queries at server
• Examples: Aurora (Brown/MIT), STREAM (Stanford), Hancock (AT&T), OpenCQ (Georgia), Tapestry (Xerox), Telegraph (Berkeley), …
– Quality-Aware Query answering
• quality aware data collection at the server
– attempts to minimize communication/energy consumption in network during
data collection
• Applications/ Queries have quality tolerance
– query tolerance converted to data quality requirement
• If query’s error tolerance met by data at server, query computed @
server
• Else, either more accurate data brought to server, or servers and
sensors collaborate to answer query
• Error tolerance of applications exploited for minimizing resource
utilization
• Examples: Quasar (UCI), TRAPP (Stanford).
– Quasar exploits in-network processing when query cannot be answered at server
Classification of Query Processing
Techniques (2)
• In network query processing
– Query originates and results needed at base station
• Two steps:
– Push query to sensor network
– gather results
• Trades computation to reduce communication among sensors.
• Examples: TinyDB (Berkeley), Cougar (Cornell)
– Query originates and results required anywhere in network
• Distributed query processing within sensor network
• Example: SURGE (UCI), research @ UCLA
Quality Aware Queries (QaQ)
[Figure: queries Q1 (A1) … Qm (Am) are posed at the server, which holds an imprecise data representation [li, ui] for sensor si. The sensor sends sensor-initiated updates from its time series (…p[n], p[n-1], …, p[1]); the server can send a probe and receive a probe result.]
• Data represented at server at a given error tolerance
– Actual sensor values: Pi = pi[1], pi[2], …, pi[n], … for sensor i
– Server representation: Si = si[1], si[2], …, si[n], … for sensor i
– Error guarantee: for all i, j: error(pi[j], si[j]) < εi for a given value of εi
• Queries have an associated level of error tolerance.
• If query quality tolerance satisfied at server (more than εi)
– Answer query at the server
• Else
– Probe the sensor
– Sensor guaranteed to respond within a bounded time τ
• Approach guarantees quality tolerance of queries
Overview of QaQ Processing Research
• Mapping application quality requirement to data quality requirements
– Target Tracking using acoustic sensors [MW ‘03]
– Spatial range queries [DEXA ‘03]
• Quality-based data collection
– General framework [DS Online ‘03]
– To support monitoring queries over current data [Qi+03]
– For sensor data archival [ICDE ‘03]
– With real-time constraints [RTSS ‘03]
– With support for in-network aggregation [Yu+03]
• Quality-cognizant query processing
– Aggregation queries [Quasar-1, Trap-1, Trap-2]
– Continuous aggregation queries [Trap-3]
– Selection Queries [ICDE ‘04]
– General SQL queries (open problem)
QaQ Selection: Problem Definition
• There is a collection T of imprecise objects
– E.g., { [1,3], [2,5], [4,9] } represents {2, 3, 5}
• The query is: “Retrieve objects from T which satisfy predicate σ”
– The query specifies quality requirements
– The system must return some approximate result that
meets the quality requirements and with minimum overall
cost.
Impact of Data Imprecision
[Figure: imprecise objects a to f drawn against the selection region σ. The precise object ω(o) behind an imprecise object o can be retrieved with a probe.]
• Objects are classified as:
– a is a NO object
– b, f are MAYBE objects
– c, d, e are YES objects
• The exact set is E = { ω(b), ω(c), ω(d), ω(e) }
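The YES / MAYBE / NO classification can be sketched in Python for the simple case of interval-valued objects and a range predicate. Both choices are assumptions made for illustration; the slides do not fix the predicate form.

```python
def classify(interval, lo, hi):
    """Classify an imprecise object [a, b] against the predicate lo <= x <= hi."""
    a, b = interval
    if lo <= a and b <= hi:
        return "YES"      # every possible value satisfies the predicate
    if b < lo or a > hi:
        return "NO"       # no possible value satisfies the predicate
    return "MAYBE"        # only a probe can settle it

# The collection T from the problem definition, with a hypothetical predicate 2 <= x <= 6.
T = [(1, 3), (2, 5), (4, 9)]
labels = [classify(obj, 2, 6) for obj in T]
```

Only MAYBE objects force a choice between probing, forwarding, and ignoring on membership grounds; YES objects may still need a probe if their laxity is too large.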
Defining Quality
[Figure: objects a to f against the selection region σ]
• Measures the accuracy of an approximate answer A
• Set-based Quality
– Precision: p = |A ∩ E| / |A|.
• E.g., p = 4/5 (if b, c, d, e, f returned as answers)
– Recall: r = |A ∩ E| / |E|.
• E.g., r = 4/4 = 1 (if b, c, d, e, f returned as answers)
• Value-based Quality
– Laxity of an object is l(o). E.g., l([2,3]) = 3-2 = 1
– Laxity of A is lmax = max over x ∈ A of l(x)
• Query specifies bounds pq, rq (lower bounds) and lqmax (upper bound)
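These quality measures translate directly into Python. The sketch below reproduces the slide's example (answer {b, c, d, e, f} against exact set {b, c, d, e}); returning 1.0 for an empty answer or empty exact set is a convention assumed here, not something the slides specify.

```python
def precision(answer, exact):
    """p = |A ∩ E| / |A|."""
    return len(answer & exact) / len(answer) if answer else 1.0

def recall(answer, exact):
    """r = |A ∩ E| / |E|."""
    return len(answer & exact) / len(exact) if exact else 1.0

def laxity(interval):
    """l(o) for an interval-valued object [lo, hi]."""
    lo, hi = interval
    return hi - lo

def max_laxity(intervals):
    """l_max over the objects in the answer."""
    return max((laxity(i) for i in intervals), default=0)

def meets_requirements(p, r, l_max, p_q, r_q, l_max_q):
    """Precision and recall are lower-bounded; laxity is upper-bounded."""
    return p >= p_q and r >= r_q and l_max <= l_max_q

A = {"b", "c", "d", "e", "f"}
E = {"b", "c", "d", "e"}
p, r = precision(A, E), recall(A, E)
```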
Evaluating QaQ Selection Operator
[Figure: each object read is classified as MAYBE, YES, or NO; the candidate actions shown are Probe, Forward, and Ignore]
• Another possibility is to store the object and deal with it later
• Might be good under certain situations based on available memory at the server
The Decision Problem
• How should the QaQ selection operator
decide
– When to probe
– When to forward
– When to ignore
• Objective:
– Meet query quality requirement
– Minimize cost
Constraints on the Decision
• Some decisions are fixed -- we have no choice!
• Objects with l(o) greater than the query tolerance lqmax must not be forwarded
• The precision guarantee pG must never be less than the query tolerance pq
– Otherwise a pq violation might result if no new YES objects are seen
• If |A ∩ Y| / (|Y| + |Ms-A|) is less than the query tolerance rq, you can’t ignore an object
– Ignoring it might lead to an rq violation if no new YES objects are seen
Two Naïve Approaches
•
Two simple heuristics:
– STINGY avoids probes: it ignores MAYBE objects and
objects exceeding the lqmax threshold.
• STINGY is conservative, but sometimes it is forced to probe to
meet the quality guarantees.
– GREEDY forwards all MAYBE objects and probes all objects
that exceed the lqmax threshold.
• GREEDY tries to produce the result quickly by not ignoring
objects, but sometimes it uses too many probes and forwards
too many objects
Impact of Probe, Forward, Ignore Actions on Quality
• + increase, - decrease, = remains the same
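The “decision” Plane (ICDE 2004)
[Figure: decision plane with s(o) on one axis (s(o)=0: No; 0<s(o)<1: Maybe; s(o)=1: Yes) and laxity l(o) on the other, split at lqmax, giving numbered regions 1 to 7. Actions shown: Ignore; Probe or ignore; Probe; Probe with probability ppy; Forward with probability pfm; Forward or ignore. Thresholds s3 and s5 divide the Maybe column.]
• s(o): probability of a MAYBE object satisfying the selection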
The Optimization Problem
• Free parameters ppy, s3, s5 , pfm
• Estimate:
– Number of YES/MAYBE/NO objects
– Number of YES/MAYBE objects exceeding the
lqmax threshold
– Distribution of s (o )
• Minimize cost W in parameter space (ppy, s3 ,
s5 , pfm) subject to Precision, Recall, Laxity
guarantees
Quality Aware Query Processing
(Review)
• Quality aware data collection
• Queries have error tolerance
• QaQ query processing optimizes resource consumption while
ensuring query quality requirement.
• A Dual problem:
– optimize quality given resource constraints
• Aurora Stream Processing system explores such an approach
AURORA in the Sensor Database Landscape
• Data representation: time sampled
• Type of query: continuous
• Query evaluation: at server
• Data & query location: server
Aurora System Model
• Input Streams are unpredictable
– If system processing capacity is reached, load must be dropped by invoking the Load Shedder
• The Output Streams must be useful to applications.
– Specified by Quality of Service (QoS)
• The Goal: shed load intelligently so that
– system operates within processing capacity
– QoS of output streams maximized
Quality of Service
Types of QoS
• Latency: shows utility drop as answers take longer to achieve (Handled by Scheduler)
• Value-based: shows which output values are most important (Handled by Load Shedder)
• Loss-tolerance: shows how approximate answers affect a query (Handled by Load Shedder)
[Figure: "Value-based QoS" plots utility (marks at 0.4 and 1.0) against values (marks at 0, 80, 120, 200); "Loss-tolerance QoS" plots utility (marks at 0.7 and 1.0) against % delivery (100, 50, 0)]
Key Questions
• How is load measured?
– Via static load coefficients and dynamic monitoring of stream rates
• When to shed load?
– When processing capacity does not suffice for handling the system load
• Where to shed load?
– In which segments of the query processing graph?
• How much load to shed?
– What fraction of tuples will be discarded?
• Which tuples to drop?
– Do tuple values affect the decision of whether to drop them or not?
Quasar Group
How to Measure Load: Load Coefficients
[Figure: input I flows through a chain of n operators with costs c1, c2, …, cn and selectivities s1, s2, …, sn to output O]
Load Coefficients (L)
the number of processor cycles required to push a single tuple
through the network to the outputs
• n operators
• ci = cost
• si = selectivity
Total Load (Load)
Depends on load coefficients Li
and input stream rates
• m input streams
• ri = stream rate
$\mathrm{Load} = \sum_{i=1}^{m} L_i \, r_i$
Load Coefficient (Example)
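[Figure: input I feeds operator 1; operator 1 feeds operators 2 and 4; operator 2 feeds operator 3; the network has two outputs, O1 and O2]
Operator | cost c | selectivity s | load coefficient L
1        | 10     | 0.5           | 22
2        | 10     | 0.8           | 14
3        | 5      | 1.0           | 5
4        | 10     | 0.9           | 10
L(I) = 22
L1 = 10 + (0.5 * 10) + (0.5 * 0.8 * 5) + (0.5 * 10) = 22
L2 = 10 + (0.8 * 5) = 14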
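The arithmetic above can be checked with a short sketch. It assumes the topology implied by the L1 expression (operator 1 feeds operators 2 and 4, and operator 2 feeds operator 3); the dictionary layout and the function name are illustrative and are not part of Aurora.

```python
# Hypothetical encoding of the example network: each operator has a cost,
# a selectivity, and the list of operators that consume its output.
OPERATORS = {
    1: {"cost": 10, "selectivity": 0.5, "children": [2, 4]},
    2: {"cost": 10, "selectivity": 0.8, "children": [3]},
    3: {"cost": 5, "selectivity": 1.0, "children": []},
    4: {"cost": 10, "selectivity": 0.9, "children": []},
}


def load_coefficient(op_id, operators=OPERATORS):
    """Cycles needed to push one tuple arriving at op_id through to the outputs."""
    op = operators[op_id]
    downstream = sum(load_coefficient(child, operators) for child in op["children"])
    # Every tuple pays this operator's cost; only the surviving fraction
    # (the selectivity) goes on to pay the downstream cost.
    return op["cost"] + op["selectivity"] * downstream


for op_id in sorted(OPERATORS):
    print(f"L{op_id} = {load_coefficient(op_id):g}")
```

Under these assumptions the sketch gives L1 = 22, L2 = 14, L3 = 5 and L4 = 10, matching the values on the slide.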
Quasar Group
When to Shed Load
N: network
I: input streams
C: processing capacity
Shed load when:
Load(N(I)) > C
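A minimal sketch of this test, assuming total load is the sum of L_i * r_i over the m input streams. The function names and the numbers in the usage lines are illustrative only.

```python
def total_load(load_coefficients, stream_rates):
    """Sum of L_i * r_i over the input streams (cycles per unit time)."""
    if len(load_coefficients) != len(stream_rates):
        raise ValueError("need one stream rate per load coefficient")
    return sum(l * r for l, r in zip(load_coefficients, stream_rates))


def must_shed(load_coefficients, stream_rates, capacity):
    """True when Load(N(I)) > C."""
    return total_load(load_coefficients, stream_rates) > capacity


# Illustrative numbers: one input stream with L = 22 cycles per tuple.
print(must_shed([22], [1000], capacity=20000))  # load 22000 exceeds 20000
print(must_shed([22], [500], capacity=20000))   # load 11000 fits
```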
Quasar Group
How to Shed Load: Drop Tuples
Modify N into N’ by inserting “drop” operators, such that:
Load(N’(I)) < H * C
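[Figure: a query network of union (U), selection (σ) and projection (π) operators feeding two QoS-annotated outputs, with drop operators inserted]
• Random Drop (Drop k%)
– Drop tuples randomly
• Semantic Drop (Filter P(value))
– Drop tuples based on the utility of their value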
Where to Shed Load
[Figure: query network with candidate drop locations 1, 2 and 3]
• Usually at the inputs, but
– Placing a drop in 1 relieves all three operators
– QoS of both output streams is affected
Quasar Group
Random Drops
Greedy approach:
Order drop locations in ascending Loss/Gain ratios
Insert drops in location with the minimum Loss/Gain ratio first; repeat
until enough capacity has been retrieved
The amount of the drop is in increments of STEP_SIZE
The drop operator has a cost: inserting a drop for <STEP_SIZE does not
retrieve any processing capacity!
Quasar Group
Semantic Drops
Greedy approach:
Each value interval has a frequency fi and a utility ui
Start dropping from the interval with minimum ui
First drop from interval with utility 0.2 and relative frequency 0.4
You can drop at most 40% of the tuples using the first interval
If this suffices, drop as many as needed
Else, choose the interval with next minimum ui
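The greedy rule for semantic drops can be sketched as follows. Only the first interval (utility 0.2, relative frequency 0.4) comes from the slide; the other two intervals, the names, and the function itself are assumptions made for illustration.

```python
def plan_semantic_drop(intervals, target_fraction):
    """Greedy plan: drop from the lowest-utility value interval first.

    intervals: list of (name, utility, relative_frequency) tuples.
    target_fraction: fraction of all tuples that must be dropped.
    Returns {interval name: fraction of all tuples dropped from it}.
    """
    plan = {}
    remaining = target_fraction
    for name, _utility, frequency in sorted(intervals, key=lambda iv: iv[1]):
        if remaining <= 1e-12:
            break
        # An interval can give up at most its own share of the tuples.
        dropped = min(frequency, remaining)
        plan[name] = dropped
        remaining -= dropped
    return plan


# First interval is from the slide; the other two are hypothetical.
INTERVALS = [("low", 0.2, 0.4), ("mid", 0.7, 0.35), ("high", 1.0, 0.25)]
print(plan_semantic_drop(INTERVALS, 0.3))  # the first interval suffices
print(plan_semantic_drop(INTERVALS, 0.5))  # spills into the next interval
```

With a 30% target the whole drop comes from the lowest-utility interval; with a 50% target that interval is exhausted at 40% and the remainder is taken from the interval with the next lowest utility.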
Quasar Group
In network Query Processing
• Two steps:
– Query Dissemination
• Exploit broadcast based routing to disseminate query to sensors
– Query execution and Result accumulation
• Gather and compute results in network en-route to the root (base station)
• Pluses
– In network computation reduces periodic communication of raw results.
– Trades computation for communication – a very worthwhile goal for sensor
nets
• 1 bit communication approx. equivalent to 800 instructions!
• Minuses
– Query dissemination and execution synchronization overheads.
• Benefit must exceed cost!
– Applicable only when sensor data does not need to be archived.
– Scalability to really large networks not studied.
• Examples
– TinyDB (Berkeley)
• TAG – in-network aggregation
• AQP – in network SQL
– SURCH (UCI)
• distributed in-network aggregation
Quasar Group
Query Propagation in TAG
Broadcast based communication
[Figure: the query SELECT COUNT(*)… is broadcast through a network of nodes numbered 1 to 5; labels: Epoch, Comm. Slot]
Quasar Group
Basic Aggregation
• In each epoch:
– Each node samples local sensors once
– Generates partial state record (PSR)
• local readings
• readings from children
– Outputs PSR during its comm. slot.
• At end of epoch, PSR for whole network output
at root
• Many optimizations possible
– grouping, pipelining
Quasar Group
Illustration: Aggregation
SELECT COUNT(*)
FROM sensors
[Figure sequence (five slides): a network of sensors numbered 1 to 5 and a Sensor # vs. Slot # grid of partial counts; the slides are labeled Slot 1, Slot 2, Slot 3, Slot 4, and Slot 1 again]
Aggregation Framework
• As in extensible databases, TAG supports any
aggregation function conforming to:
Aggn = {finit, fmerge, fevaluate}
finit{a0} → <a0> (Partial State Record, PSR)
fmerge{<a1>, <a2>} → <a12>
fevaluate{<a1>} → aggregate value
(Merge associative, commutative!)
Example: Average
AVGinit{v} → <v, 1>
AVGmerge{<S1, C1>, <S2, C2>} → <S1 + S2, C1 + C2>
AVGevaluate{<S, C>} → S/C
Quasar Group
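The AVG example from the Aggregation Framework slide can be written out as three small functions. This is a sketch of the init/merge/evaluate pattern only, not TinyDB code; the readings and the tree shape are hypothetical.

```python
from functools import reduce


def avg_init(value):
    """AVGinit{v} -> <v, 1>: wrap one reading as a partial state record."""
    return (value, 1)


def avg_merge(psr_a, psr_b):
    """AVGmerge: add sums and counts component-wise."""
    return (psr_a[0] + psr_b[0], psr_a[1] + psr_b[1])


def avg_evaluate(psr):
    """AVGevaluate{<S, C>} -> S/C."""
    total, count = psr
    return total / count


# Hypothetical readings: one child forwards its own PSR, another forwards a
# PSR that already merges two readings, and the root adds its own reading.
child_a = avg_init(10)
child_b = reduce(avg_merge, [avg_init(20), avg_init(30)])
root = reduce(avg_merge, [avg_init(40), child_a, child_b])
print(root, avg_evaluate(root))
```

Because the merge is associative and commutative, the root obtains the same record regardless of the order in which child records arrive.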
Types of Aggregates
• SQL supports MIN, MAX, SUM, COUNT, AVERAGE
• Any function can be computed via TAG
• In network benefit for many operations
– E.g. Standard deviation, top/bottom N, spatial
union/intersection, histograms, etc.
– Compactness of PSR
Quasar Group
Taxonomy of Aggregates
• TAG insight: classify aggregates according to various
functional properties
– Yields a general set of optimizations that can automatically be applied
Property               | Examples                                      | Affects
Partial State          | MEDIAN : unbounded, MAX : 1 record            | Effectiveness of TAG
Duplicate Sensitivity  | MIN : dup. insensitive, AVG : dup. sensitive  | Routing Redundancy
Exemplary vs. Summary  | MAX : exemplary, COUNT : summary              | Applicability of Sampling, Effect of Loss
Monotonic              | COUNT : monotonic, AVG : non-monotonic        | Hypothesis Testing, Snooping
Quasar Group
TAG Advantages
• Communication Reduction
– Important for power and contention
• Continuous stream of results
– Smooth transient faults across epochs
• Lots of optimizations
– Via operator semantics
Quasar Group
Simulation Environment
• Evaluated via simulation
• Coarse grained event based simulator
– Sensors arranged on a grid
– Two communication models
• Lossless: All neighbors hear all messages
• Lossy: Messages lost with probability that increases with
distance
Quasar Group
Benefit of In-Network Processing
Simulation Results
• 2500 Nodes, 50x50 Grid
• Depth = ~10, Neighbors = ~20
[Figure: Total Bytes Xmitted vs. Aggregation Function; y-axis: Total Bytes Xmitted (0 to 100000); x-axis: Aggregation Function (EXTERNAL, MAX, AVERAGE, COUNT, MEDIAN)]
Some aggregates require dramatically more state!
Processing in Network SQL
Processing (Berkeley)
• Query Disseminated to sensors
• Results gathered en-route to the root (base station)
• Issues:
– How should the query be processed?
• Sampling as an operator, Power-optimal ordering
• Frequent events as joins
– Which nodes have relevant data?
• Semantic Routing Tree for effective pruning
– Nodes that are queried together route together
– Which samples should be transmitted?
• Pick most “valuable”?
• Adaptive transmission & sampling rates
Quasar Group
Power-Optimal Operator Ordering:
Interleave Sampling + Selection
SELECT light, mag FROM sensors
WHERE pred1(mag) AND pred2(light)
SAMPLE INTERVAL 1s
• Energy cost of sampling mag >> cost of sampling light
1500 uJ vs. 90 uJ
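• Candidate orderings (ordering 2 is the correct one unless pred1 is very selective, since it samples the expensive mag sensor only when pred2(light) passes):
1. Sample light, Sample mag, Apply pred1, Apply pred2
2. Sample light, Apply pred2, Sample mag, Apply pred1
3. Sample mag, Apply pred1, Sample light, Apply pred2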
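The comparison between the three orderings can be made concrete with a small expected-cost calculation. The sampling costs (1500 uJ for mag, 90 uJ for light) are the figures quoted on the slide; the selectivities, the plan encoding, and the assumption that the two predicates are independent are illustrative only.

```python
SAMPLE_COST_UJ = {"light": 90.0, "mag": 1500.0}

# Each plan is a sequence of ("sample", attribute) or ("filter", predicate).
PLANS = {
    1: [("sample", "light"), ("sample", "mag"), ("filter", "pred1"), ("filter", "pred2")],
    2: [("sample", "light"), ("filter", "pred2"), ("sample", "mag"), ("filter", "pred1")],
    3: [("sample", "mag"), ("filter", "pred1"), ("sample", "light"), ("filter", "pred2")],
}


def expected_cost(plan, selectivity):
    """Expected sampling energy per epoch, assuming independent predicates."""
    alive = 1.0  # probability that the tuple has not been filtered out yet
    cost = 0.0
    for kind, name in plan:
        if kind == "sample":
            cost += alive * SAMPLE_COST_UJ[name]
        else:
            alive *= selectivity[name]
    return cost


# Hypothetical selectivities: each predicate passes half of the tuples.
selectivity = {"pred1": 0.5, "pred2": 0.5}
for plan_id, plan in PLANS.items():
    print(plan_id, expected_cost(plan, selectivity))
```

With these assumed selectivities ordering 2 costs 840 uJ per epoch, against 1590 uJ for ordering 1 and 1545 uJ for ordering 3. Ordering 3 only wins when pred1 rejects almost everything while pred2 rejects almost nothing.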
Quasar Group
Adapted from slides ©Sam Madden
Attribute Driven Topology Selection
• Observation: internal queries often over local
area
– Or some other subset of the network
• E.g. regions with light value in [10,20]
• Idea: build topology for those queries based on
values of range-selected attributes
– For range queries
– Relatively static trees
• Maintenance Cost
Quasar Group
Adapted from slides ©Sam Madden
Attribute Driven Query Propagation
SELECT …
WHERE a > 5 AND a < 12
[Figure: routing tree over nodes 1 to 4, annotated with the precomputed intervals [1,10], [7,15] and [20,40]]
• Precomputed intervals = Semantic Routing Tree (SRT)
• Early pruning
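A minimal sketch of the pruning test, assuming each child advertises a closed interval of attribute values and the query range is the open interval (5, 12) from the WHERE clause. The child names are hypothetical, since the figure does not say which node owns which interval.

```python
def overlaps_open_query(child_interval, query_low, query_high):
    """True if closed [lo, hi] intersects the open range (query_low, query_high)."""
    lo, hi = child_interval
    return lo < query_high and hi > query_low


def children_to_forward(srt_intervals, query_low, query_high):
    """Children whose advertised interval can contain matching values."""
    return [child for child, interval in srt_intervals.items()
            if overlaps_open_query(interval, query_low, query_high)]


# Intervals from the slide; the child names are placeholders.
SRT = {"child_a": (1, 10), "child_b": (7, 15), "child_c": (20, 40)}
print(children_to_forward(SRT, 5, 12))  # WHERE a > 5 AND a < 12
```

The subtree advertising [20,40] is pruned, so the query is never forwarded into it.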
Quasar Group
Adapted from slides ©Sam Madden
Attribute Driven Parent Selection
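[Figure: node 4, with value range [3,6], choosing among candidate parents 1, 2 and 3; the parents are annotated with the intervals [1,10], [7,15] and [20,40]]
Even without intervals, expect that sending to parent with closest value will help
[3,6] ∩ [1,10] = [3,6]
[3,6] ∩ [7,15] = ø
[3,6] ∩ [20,40] = ø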
Quasar Group
Adapted from slides ©Sam Madden
Simulation Result
[Figure: Nodes Visited vs. Query Range; y-axis: # of Nodes Visited (400 = Max); x-axis: Query Size as % of Value Range (0.001, 0.05, 0.1, 0.2, 0.5, 1); legend labels: Best Case (Expected), Closest Parent, Random Parent, Nearest Value, Snooping]
(Random value distribution, 20x20 grid, ideal connectivity to (8) neighbors)
Quasar Group
Adapted from slides ©Sam Madden
Acquisitional Query Processing
• How should the query be processed?
– Sampling as an operator, Power-optimal ordering
– Frequent events as joins
• Which nodes have relevant data?
– Semantic Routing Tree for effective pruning
• Nodes that are queried together route together
• Which samples should be transmitted?
– Pick most “valuable”?
– Adaptive transmission & sampling rates
Quasar Group
Adapted from slides ©Sam Madden
Adaptive Transmission Rates
[Figure: Sample Rate vs. Delivery Rate; x-axis: Samples Per Second (Per Mote), 0 to 16; y-axis: Aggregate Delivery Rate (Packets/Second), 0 to 8; series: 1 mote, 4 motes, 4 motes adaptive; annotation: Adaptive = 2x % Successful Xmissions]
TinyDB monitors channel contention & backs-off as needed
Quasar Group
Adapted from slides ©Sam Madden
Prioritizing Data Delivery
• Score each item
• Send largest score
– Out of order -> Priority Queue
• Discard or aggregate when buffer is full
Quasar Group
Adapted from slides ©Sam Madden
Choosing Data To Send
• Delta encoding
[Figure sequence (six slides): Time vs. Value plot of four samples written as [time, value]: [1,2], [2,6], [3,15], [4,1]]
• [1,2] is sent first
• Select which of the 3 to send
– scores shown: |2-6| = 4 for [2,6], |2-15| = 13 for [3,15], |2-4| = 2 for [4,1]
• After [3,15] is sent, keep selecting until hit max delivery rate
– scores shown: |2-6| = 4 for [2,6], |15-4| = 11 for [4,1]
• [4,1] is sent next, leaving [2,6]
• If manage to send all: [1,2], [2,6], [3,15], [4,1]
Delta + Adaptivity
• 8 element queue
• 4 motes
transmitting
different signals
• 8 samples /sec /
mote
Quasar Group
Adapted from slides ©Sam Madden
SURCH in the Sensor Database Landscape
http://www.ics.uci.edu/~quasar
• Data representation: Precise
• Type of query: ad hoc aggregation
• Query Evaluation: In network, distributed
• Data & Query Location: At sensors
Quasar Group
SURCH Query Processing
• SURCH Query:
ON EVENT e
SELECT Attributes or Aggregates
FROM Sensors S
WHERE S.loc ∈ Region
DESTINATION nodeID
• Event based Query UPON Predicate
– may initiate at any node in network
• Results accumulated at a specified destination
• Region specifies selection on sensors
• In network (fully distributed) query processing
Quasar Group
SURCH Query Processing
• Three Phases
– Neighborhood discovery
• broadcast based communication
– Query Propagation
• a sensor propagates if its neighborhood contains sensors to
which query not yet propagated
– Capture Partial results and route to destination
• a node holds partial results if it contains aggregate values that
are not broadcast further
[Figure: network diagram labeled with the query generator, query Q, initiator1 and initiator2, partial results r1 and r2 (result1, result2), and the destination]
Quasar Group
Neighborhood Discovery
[Figure: node ns and its neighbors nn1, nn2, …, nnk; arrows labeled broadcast, response and re-broadcast]
– A node ns broadcasts the query (e.g. MAX) and current result to all
neighbors.
– Neighbor nni responds with its value vni after waiting for a time period
(TTR) based on fitness of value
• node having data with highest “fitness” value responds first.
– If partial results change, immediate rebroadcast by ns to neighbors
• high likelihood that all neighbors learn the new MAX even without
responding
Quasar Group
Query Propagation
• 1-Dimensional illustration for a MAX query
• ni initiates a query
• nr1 and nr2 hold partial results.
[Figure sequence (three slides): value plotted against sensors along a line, with the radio range of ni marked and numbered labels spreading outward from ni; the last slide marks nodes nr1 and nr2]
Quasar Group
Capture Partial Results
• Who has the partial results?
– Nodes whose results are not propagated further
• boundary of the query region
• irregular propagation frontier
– detected by remembering if any neighbor propagates
the query at next level.
• The partial results will be sent to a destination
node for final processing.
Quasar Group
Issue in Query Propagation
• Which nodes should broadcast query in network?
• Choose the broadcasting nodes based on
optimization goals:
– minimal overall cost
• minimum number of broadcasting nodes
• minimum size connected dominating set
– maximum network lifetime (uniform workload)
• take into account energy level of individual node.
• Heuristics to achieve optimization goals
– minimal overall cost
• choose based on number of undiscovered neighbors
– maximize lifetime
• battery threshold
Quasar Group
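The "number of undiscovered neighbors" heuristic can be sketched as a greedy selection. This is a centralized illustration under stated assumptions (a known neighbor map, a connected network, ties broken by node name), not the distributed SURCH protocol; the graph and all names are hypothetical.

```python
def choose_broadcasters(neighbors, initiator):
    """Greedily pick broadcasting nodes by number of undiscovered neighbors.

    neighbors: dict mapping each node to the nodes within its radio range.
    Returns the nodes that broadcast the query, in the order chosen.
    """
    discovered = {initiator}   # nodes that have already heard the query
    candidates = {initiator}   # heard the query but have not broadcast yet
    broadcasters = []

    def undiscovered(node):
        return set(neighbors[node]) - discovered

    while candidates and len(discovered) < len(neighbors):
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda n: len(undiscovered(n)))
        newly = undiscovered(best)
        if not newly:
            break  # nobody reachable is left to discover
        candidates.remove(best)
        broadcasters.append(best)
        discovered |= newly
        candidates |= newly
    return broadcasters


# Hypothetical six-node network.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "e", "f"],
    "d": ["b"],
    "e": ["c"],
    "f": ["c"],
}
print(choose_broadcasters(GRAPH, "a"))
```

In this example only three of the six nodes broadcast; the leaves d, e and f hear the query without forwarding it. A lifetime-oriented variant could skip candidates whose battery level is below a threshold.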
Simulation Results
• SURCH is very efficient at processing queries
that do not need response from every node:
Quasar Group
Summary of Query Processing
• Queries provide an expressive and easy to use interface for
programming sensors
– Rapid application development
– Transparent optimization
• Application writers can focus on the application logic and not how to
optimize it for sensor networks
• Query processing in sensor networks is a difficult challenge
• Highly dynamic data, Energy/power constraints, Lossy, low bandwidth
broadcast based communication
– Standard approach of layering and isolating functionality into
relatively independent software components will not work. OS,
middleware, network, and queries will need to be co-optimized
• Issues in query processing
– Where data resides, how is data represented, where queries are
initiated, where results need to be delivered, where queries are
processed
Quasar Group
Future Work in Query Processing in
Sensor Databases
• A rich sensor database research landscape
– No clear winners yet
• Many important open issues
– A formal semantics of query language
– A scalable architecture for sensor data gathering and query
processing
– Fault-tolerance and real-time constraints in query processing
– Integrating sensor data (and queries) with
• other sensor data (sensor data fusion)
• Other relational information
– XML and its role in sensor data
Quasar Group
Summary
• Sensor networks present a very wide range of system optimization
opportunities for power, application quality and performance
• Energy efficiency is a system level concern that cuts across subsystem
components, functionality layers and its implementations
• Key components
– Low power sensor microarchitectures
– Careful partitioning of functionality in distributed sensor network architecture
– Energy aware operating systems
– Query driven sensor data management
– dynamic power management that coordinates capabilities against application
needs
• Real-time, fault-tolerance, application quality needs
– energy efficient communications and networking
• energy aware MAC, routing, transport
Quasar Group
Questions??
Quasar Group