An Agenda for Information Theory
Research in Sensor Networks
Outline
• Introduction
• The Conventional Paradigm
• The Emerging Paradigm
• New Theory Challenges
Greg Pottie
UCLA EE Department
Center for Embedded Networked Sensing
Pottie@icsl.ucla.edu
Introduction
• Much research has focused on sensor networks under alternative sets of assumptions:
– Memory, processing, and sensing will be cheap, but
communications will be dear; thus in deploying large numbers of
sensors concentrate on algorithms that limit communications but
allow large numbers of nodes
– For the sensors to be cheap, even the processing should be
limited; thus in deploying even larger numbers of sensors
concentrate on algorithms that limit both processing and
communications
• In either case, compelling theory can be constructed for
random deployments with large numbers and flat
architectures
Theory for Dense Flat Networks of Simple
Nodes
• Redundant communications pathways given unreliable
radios
• Data aggregation and distributed fusion
– Combinations with routing, connections with network rate distortion
coding
• Scalability
• Density/reliability/accuracy trades
• Cooperative communication
• Adaptive fidelity/network lifetime trades
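To make the data aggregation and distributed fusion bullet concrete, here is a minimal sketch (my own illustration, not material from the talk) comparing the number of transmissions when every raw reading is relayed hop by hop to the sink against in-network averaging, where each node forwards only a (sum, count) partial aggregate; the random tree topology and the transmission-count metric are illustrative assumptions.

```python
# Toy comparison of raw forwarding vs. in-network aggregation on a routing tree.
# Tree shape and the "messages transmitted" metric are illustrative assumptions.
import random

def build_random_tree(n, seed=0):
    """Return parent[] for a random tree rooted at node 0 (the sink)."""
    random.seed(seed)
    parent = [None] * n
    for v in range(1, n):
        parent[v] = random.randrange(0, v)   # attach to an earlier node
    return parent

def depth(v, parent):
    d = 0
    while parent[v] is not None:
        v = parent[v]
        d += 1
    return d

def raw_forwarding_cost(parent):
    """Every reading is relayed hop by hop to the sink."""
    return sum(depth(v, parent) for v in range(1, len(parent)))

def aggregation_cost(parent):
    """Each node sends one (sum, count) partial aggregate to its parent."""
    return len(parent) - 1

n = 200
parent = build_random_tree(n)
print("raw forwarding transmissions    :", raw_forwarding_cost(parent))
print("in-network average transmissions:", aggregation_cost(parent))
```

The gap between the two counts grows with the depth of the routing tree, which is one way the communication-limiting motivation above shows up quantitatively.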
What applications?
• Early research concentrated on short-term military
deployments
– Can imagine that leaving batteries everywhere is at least as
acceptable as leaving depleted uranium bullets; careful
placement/removal might expose personnel to danger
– Detection of vehicles (and even ID of type) and detection of
personnel can be accomplished with relatively inexpensive sensors
that don’t need re-calibration or programming in the field
• Story was plausible…
But was this ever done?
• Military surveillance
– Largest deployment (1000 nodes or so) was in fact hierarchical and
required careful placement; major issues with radio propagation
even on flat terrain
– Vehicles are really easy to detect with aerial assets, and the major
problem with personnel is establishment of intent; this requires a
sequence of images
– Our major problems are not battles, but insurgencies, which
demand much longer-term monitoring as well as concealment
• Science applications diverge even more in basic
requirements
– Scientists want to know precisely where things are; cannot leave
heavy metals behind; many other issues
• Will still want dense networks of simple nodes in some locations, but they will be one component of a larger system
Sampling and Sensor Networks
• Basic goal is to enable new science
– Discover things we don’t know now
– Do this at unprecedented scales in remote locations
• This is a data-driven process: measure phenomena, build
models, make more measurements, validate or reject
models, … continue
• Spatiotemporal sampling: a fundamental problem in the
design of any ENS system
– Spatial: Where to measure
– Temporal: How often to measure
• (Nearly) all problems in ENS system design are related to
sampling: coverage, deployment, time-sync, data dissemination, sufficiency to test hypotheses, reliability…
Adaptive Sampling Strategies
• Over-deploy: focus on scheduling which
nodes are on at a given time
• Actuate: work with smaller node densities,
but allow nodes to move to respond to
environmental dynamics
• Our apps are at large scales and highly
dynamic: over-deployment not an option
– Always undersampled with respect to some
phenomenon
– Focus on infrastructure supported mobility
– Passive supports (tethers, buoyancy)
– Small number of moving nodes
• Will need to extend the limited sets of
measurements with models
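As a minimal sketch of adaptive temporal sampling (an illustration, not an algorithm from the talk), the node below shortens its sampling interval when the observed change between samples is large and backs off when the field is quiet; the phenomenon() signal, thresholds, and interval bounds are all assumed for the example.

```python
# Minimal sketch of adaptive temporal sampling: shorten the interval when the
# phenomenon changes quickly, lengthen it when the signal is quiet.
import math

def phenomenon(t):
    """Stand-in for the physical field: slow drift plus an occasional burst."""
    return math.sin(0.005 * t) + (2.0 if 300 <= t <= 350 else 0.0)

def adaptive_sample(t_end=600.0, dt_min=1.0, dt_max=30.0, threshold=0.2):
    t, dt = 0.0, dt_max
    samples = [(t, phenomenon(t))]
    while t < t_end:
        t += dt
        value = phenomenon(t)
        change = abs(value - samples[-1][1])
        samples.append((t, value))
        # Speed up when the field is changing, back off gradually when quiet.
        dt = dt_min if change > threshold else min(dt * 2.0, dt_max)
    return samples

samples = adaptive_sample()
print("adaptive samples taken:", len(samples), " vs. uniform at dt_min:", int(600 / 1.0))
```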
Evolution to More Intelligent Design
• Early sensor network research focused on resource
constrained nodes and flat architecture
– High density deployments with limited application set
• Many problems with this flat architecture
– Software is nightmarish
– Always undersample physical world in some respect
– Logistics are very difficult; usually must carefully place, service, and
remove nodes
• The major constraint in sustained science observations is
the sensor
– Biofouling/calibration: must service the nodes
• Drives us towards tiered architecture that includes mobile
nodes
– Many new and exciting theory problems
Some Theory Problems
• Data Integrity
– Sufficiency of network components/measurements to trust
results
• Model Uncertainty
– Effects on deployment density, number of measurements
needed given uncertainty at different levels
• Multi-scale sensing
– Information flows between levels; appropriate populations at
the different levels given sensing tasks
– Local interactions assume increased importance
• Logistics management
– Energy mules
– Mobile/fixed node trades
Many Models
• Source Phenomena
– Discrete sets vs. continuous, coupling to medium,
propagation medium, noise and interference processes
• Sensor Transduction
– Coupling to medium, conversion to electrical signal, drift and
error sources
• Processing Abstractions
– Transformation to reduced representations, fusion among
diverse sensor types
• System Performance
– Reliability of components, time to store/transport data at
different levels of abstraction
Much Uncertainty
• Observations (Data)
– Noisy, subject to imperfections of signal conversion,
interference, etc.
• Model Parameters
– Weighting of statistical and deterministic components;
selection of model order
• Models
– Particular probability density function family, differential
equation set, or in general combination of components
• Goals and System Interactions
– Goals can shift with time, interactions with larger system not
always well-defined
Model and Data Uncertainty in Sensor Networks
• How much information is
required to trust either data or
a model?
• Approach: multi-level network
and corresponding models;
evaluation of sequence of
observations/experiments
Data Uncertainty
Multiple nodes observe a source, exchange reputation information, and then interact with a mobile audit node
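A minimal sketch of this data-uncertainty setting, with the fault model, reputation rule, and audit update all assumed for illustration: nodes are weighted by mutual agreement, and the trusted reading from the mobile audit node re-weights them.

```python
# Minimal sketch: several nodes observe the same source, weight each other by
# agreement (a crude "reputation"), and a mobile audit node with a trusted
# reading revises those weights. Ground truth and update rules are illustrative.
import random

random.seed(1)
truth = 20.0
readings = [truth + random.gauss(0, 0.3) for _ in range(7)]
readings[3] += 5.0                      # one miscalibrated node

# Reputation from mutual agreement: penalize distance to the group median.
median = sorted(readings)[len(readings) // 2]
reputation = [1.0 / (1.0 + abs(r - median)) for r in readings]

def fused(readings, reputation):
    return sum(r * w for r, w in zip(readings, reputation)) / sum(reputation)

print("fused before audit:", round(fused(readings, reputation), 2))

# Mobile audit node supplies a trusted reference measurement.
audit = truth + random.gauss(0, 0.05)
reputation = [1.0 / (1.0 + abs(r - audit)) for r in readings]
print("fused after audit :", round(fused(readings, reputation), 2))
```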
Model Uncertainty
How many nodes must sample a
field to determine it is caused by
one (or more) point sources?
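One way to pose this question quantitatively is as model selection. The sketch below (an illustration, not the talk's formulation) fits one-source and two-source field models to noisy samples and compares them with BIC as the number of sampling nodes grows; the Gaussian-bump field, the grid search over source locations, and the noise level are assumptions.

```python
# Minimal sketch: how many nodes before a two-source explanation of a sampled
# field is preferred over a one-source one? Field shape, grid search, and the
# BIC criterion are illustrative assumptions.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
true_sources = [2.0, 7.0]               # two point sources on [0, 10]
sigma = 0.1                             # assumed known measurement noise

def field(x, sources):
    return sum(np.exp(-((x - s) ** 2)) for s in sources)

def fit_rss(x, y, k, grid=np.linspace(0, 10, 41)):
    """Best residual sum of squares for a k-source model (grid search on locations)."""
    best = np.inf
    for locs in combinations(grid, k):
        A = np.column_stack([np.exp(-((x - s) ** 2)) for s in locs])
        amp, *_ = np.linalg.lstsq(A, y, rcond=None)
        best = min(best, float(np.sum((y - A @ amp) ** 2)))
    return best

for n in (4, 8, 16, 32):
    x = rng.uniform(0, 10, n)
    y = field(x, true_sources) + rng.normal(0, sigma, n)
    bic = {}
    for k in (1, 2):
        rss = fit_rss(x, y, k)
        # Gaussian log-likelihood with known noise (up to constants) plus
        # a penalty of p*ln(n) with p = 2k (location + amplitude per source).
        bic[k] = rss / sigma ** 2 + 2 * k * np.log(n)
    winner = min(bic, key=bic.get)
    print(f"n={n:2d}  BIC(1 src)={bic[1]:8.1f}  BIC(2 src)={bic[2]:8.1f}  -> prefer {winner} source(s)")
```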
A Few Problems
• Validation (=debugging) is usually very painful
– One part design, 1000 parts testing
– Never exhaustively test with the most reliable method
• So how can we trust the result given all the
uncertainties?
– Not completely, so the design process deliberately minimizes the
uncertainties through re-use of trusted components
• But is the resulting modular model/design efficient?
– Fortunately not for academics; one can always propose a more
efficient but untestable design
• Our goal: quantifying this efficiency vs. validation effort
tradeoff in model creation for environmental applications
Universal Design Procedure
• Innovate as little as possible to achieve goals
– Applies to a surprisingly large number of domains of human activity
• Begin with what we know
– E.g., trusted reference experiment, prior model(s)
• Validate a more efficient procedure
– Exploit prior knowledge to test selected cases
• Bake-off the rival designs or hypotheses
– Use your favorite measure of fitness
• Iterate
– Result is usually a composite model with many components
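A minimal sketch of the bake-off step under assumed data and candidate models: two rival designs (here, polynomial models of different order) are scored by a held-out fitness measure (cross-validated squared error) and the better one is kept for the next iteration.

```python
# Minimal sketch of a "bake-off" between rival models using a held-out fitness
# measure. The data, the two candidates, and the fold count are illustrative.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 60)
y = 1.0 + 2.0 * x - 1.5 * x ** 2 + rng.normal(0, 0.05, x.size)   # unknown "truth"

def cv_error(degree, folds=5):
    idx = np.arange(x.size)
    rng.shuffle(idx)
    errors = []
    for fold in np.array_split(idx, folds):
        train = np.setdiff1d(idx, fold)
        coeffs = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coeffs, x[fold])
        errors.append(np.mean((pred - y[fold]) ** 2))
    return float(np.mean(errors))

candidates = {"linear (degree 1)": 1, "quadratic (degree 2)": 2}
scores = {name: cv_error(d) for name, d in candidates.items()}
for name, s in scores.items():
    print(f"{name:22s} CV error = {s:.5f}")
print("bake-off winner:", min(scores, key=scores.get))
```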
Example: Radio Propagation
• Model from First Principles: Maxwell’s Equations
– Complete description (until we get to the scale of quantum dynamics)
– Economy of principles
– Computationally intractable for large volumes
– Many parameters that must be empirically determined
• Practical approach: hybrid models
– Start with geometric optics (rays + Huygens' principle)
– Add statistical models for unobserved or dynamic factors in the environment
– Choice of statistical models determined by geometric factors
– Deeper investigation as required, using either extensive observations or occasional solution of Maxwell's equations for sample volumes
– Level of detail in model depends on goals
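A minimal sketch of such a hybrid model, with the path-loss exponent, shadowing spread, and reference loss chosen purely for illustration: a deterministic log-distance (geometric) term plus a log-normal statistical term for the factors the geometry leaves unobserved.

```python
# Minimal sketch of a hybrid propagation model: deterministic log-distance path
# loss plus log-normal shadowing. Parameter values are illustrative, not measured.
import math
import random

def path_loss_db(d, d0=1.0, pl0_db=40.0, exponent=3.0, shadow_sigma_db=6.0):
    """Path loss at distance d (metres), in dB."""
    deterministic = pl0_db + 10.0 * exponent * math.log10(d / d0)
    shadowing = random.gauss(0.0, shadow_sigma_db)      # statistical residual
    return deterministic + shadowing

random.seed(0)
for d in (10, 50, 100, 500):
    samples = [path_loss_db(d) for _ in range(1000)]
    mean = sum(samples) / len(samples)
    print(f"d = {d:4d} m   mean loss ~ {mean:6.1f} dB")
```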
Two-Level Models
• Each level in hierarchy contains reference experiments
– Trusted, but resource intensive and/or limited to particular scales
• Higher level establishes context
– Selects among set of models at lower level corresponding to each context
– Each of these sets contains a reference model/experimental procedure
• This system allows re-use of components
– Limits validation requirements
– Extensible to new environments and scales by adding new modules
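A minimal sketch of the two-level idea, with the context names, classifier rule, and per-context correction factors all assumed: the upper level classifies context from a wide-area observation, and the context selects which lower-level reference model interprets the local reading.

```python
# Minimal sketch of a two-level model: an upper level classifies the context,
# and the context selects which lower-level (reference) model is applied to the
# local measurement. Contexts, rules, and factors are illustrative assumptions.

def classify_context(wide_area_obs):
    """Upper level: crude context decision from a wide-area observation."""
    return "open" if wide_area_obs["canopy_fraction"] < 0.3 else "under_canopy"

# Lower level: one trusted reference model per context.
LOWER_LEVEL_MODELS = {
    "open":         lambda par_reading: par_reading,          # direct sunlight
    "under_canopy": lambda par_reading: par_reading / 0.35,   # assumed attenuation correction
}

def estimate_incident_light(wide_area_obs, par_reading):
    context = classify_context(wide_area_obs)
    return context, LOWER_LEVEL_MODELS[context](par_reading)

print(estimate_incident_light({"canopy_fraction": 0.1}, 1200.0))
print(estimate_incident_light({"canopy_fraction": 0.8}, 400.0))
```

Because each context's reference model is validated once and then reused, adding a new environment only requires validating the new module, not the whole system.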
Example: Fiat Lux
• Top level: camera/laser mapper providing context and wider area
coverage
– Direct locations for PAR sensors to resolve ambiguities due to ground
cover
• Modular model construction
– Begin with simple situations: pure geometric factors, calibration of
instruments
– Progress to add statistical components: swaying of branches, distributions
of leaves/branches at different levels of canopy, ground cover
• Resulting model is hybrid combination of:
– Deterministic causal effects
– Partially characterized causes (statistical descriptions)
• Level of detail depends on goals
– Reconstruction, statistics or other function of observations
Early Experiments
• Sensors with different modes
and spatial resolutions
– E.g. PAR sensor and camera
– PAR measures local incident
intensity
– Camera measures relative
reflected intensity
• Provides better spatial and temporal resolution, at the cost of requiring careful calibration
• Analogous to remote sensing
on local scales
• A homogeneous screen is placed to create a reflection Er proportional to the incident light Ec
• The camera captures the reflection on its CCD
• The image pixel intensity is transformed to Er using the camera's characteristic curve
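A minimal sketch of that pixel-to-Er transformation, assuming a gamma-style characteristic curve and a co-located PAR reading to fix the absolute scale (both are assumptions for illustration, not the deployed calibration).

```python
# Minimal sketch: map pixel intensities back through the camera's characteristic
# (response) curve to relative reflected intensity, then scale with a co-located
# PAR reading. The response curve and calibration values are illustrative.
import numpy as np

# Assumed characteristic curve: relative intensity -> pixel value (0..255).
relative_intensity = np.linspace(0.0, 1.0, 256)
pixel_value = 255.0 * relative_intensity ** (1.0 / 2.2)    # gamma-like response

def pixels_to_relative_intensity(pixels):
    """Invert the characteristic curve by interpolation."""
    return np.interp(pixels, pixel_value, relative_intensity)

# Calibration: one image patch is co-located with a PAR sensor.
par_reading = 850.0                      # umol m^-2 s^-1 at the reference patch
reference_pixel = 180.0
scale = par_reading / pixels_to_relative_intensity(reference_pixel)

image_pixels = np.array([120.0, 150.0, 200.0, 230.0])
estimated_par = scale * pixels_to_relative_intensity(image_pixels)
print(np.round(estimated_par, 1))
```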
If 2 levels are good, n levels are even better!
• Extend model to include remote sensing; additional levels of "side information" and/or sources for data fusion
[Figure: stacked GIS layers: Daily Average Temperature (Geostatistical Analyst), Aspect (Spatial Analyst), Slope (Spatial Analyst), Elevation (calculated from contour map), Aerial Photograph (10.16 cm/pixel)]
[Figure: "Hourly Temperature for June 5 2004": hourly temperature traces for 20 sensor series, presented as graphs and 3D images]
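A minimal sketch of fusing the sparse ground measurements with such remote-sensing side information: fit a simple linear model on the instrumented sites and predict at uninstrumented ones. The synthetic layers, coefficients, and the linear form are assumptions for illustration; a real analysis would use geostatistical tools like those named above.

```python
# Minimal sketch of data fusion with remote-sensing "side information" layers:
# fit a linear model of temperature on elevation/slope/aspect at the node sites,
# then predict at a site described only by the remote layers. Data are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 11

# Side-information layers sampled at the instrumented node locations.
elevation = rng.uniform(200.0, 400.0, n_nodes)      # metres
slope     = rng.uniform(0.0, 30.0, n_nodes)         # degrees
aspect    = rng.uniform(0.0, 360.0, n_nodes)        # degrees

# Synthetic ground measurements (lapse rate on elevation plus noise).
temperature = 25.0 - 0.0065 * elevation + 0.02 * slope + rng.normal(0, 0.2, n_nodes)

X = np.column_stack([np.ones(n_nodes), elevation, slope, np.cos(np.radians(aspect))])
coeffs, *_ = np.linalg.lstsq(X, temperature, rcond=None)

# Predict at an uninstrumented location described only by the remote layers.
new_site = np.array([1.0, 320.0, 12.0, np.cos(np.radians(180.0))])
print("fitted coefficients:", np.round(coeffs, 4))
print("predicted temperature at new site:", round(float(new_site @ coeffs), 2))
```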
Layers and Modules vs. Tabula Rasa Design
• Fresh approach (e.g. “cross-layer design”) allows optimization
according to particular goals
– Yields efficiency of operation
– But may lack robustness, and requires a much larger validation effort each time new goals/conditions are considered
– Size of model parameter set can be daunting
• Sequential set of experiments allows management of uncertainty at
each step
– Minimizes marginal effort; if each experiment or design in the chain was of interest, the overall effort is (likely) also minimized
– Naturally lends itself to Bayesian approach; many information theory
opportunities
– But has an overhead in terms of components not required for given
instantiation
• Research goal is quantification of efficiency/validation tradeoff
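A minimal sketch of the Bayesian view of such a sequential experiment chain, with the two rival models and their priors assumed for illustration: each new observation updates the posterior over the rivals, so later experiments only need to resolve whatever uncertainty remains.

```python
# Minimal sketch of sequential Bayesian model comparison: update the posterior
# probability of two rival models one observation at a time. The models (two
# Gaussian means), priors, and noise level are illustrative assumptions.
import math
import random

def gaussian_pdf(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

models = {"model A (mean 0.0)": 0.0, "model B (mean 0.5)": 0.5}
posterior = {name: 0.5 for name in models}          # uniform prior
sigma = 1.0

random.seed(0)
for step in range(1, 21):
    x = random.gauss(0.5, sigma)                    # data actually follow "model B"
    for name, mean in models.items():
        posterior[name] *= gaussian_pdf(x, mean, sigma)
    total = sum(posterior.values())
    posterior = {name: p / total for name, p in posterior.items()}
    if step % 5 == 0:
        print(f"after {step:2d} observations:",
              {name: round(p, 3) for name, p in posterior.items()})
```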
Conclusion
• Development of multi-layered systems
– Physical phenomenon modeled at multiple abstraction layers
– Hardware has many levels, from tags to mobile infrastructure
– Software abstractions and tools in support of this development
– Theoretical study of information flows among these levels
• New and interesting problems arise from real deployments
– Even seemingly simple phenomena such as light patterns in forests
are amazingly complicated to model
– Approach through sequence of related experiments and models