Collecting Correlated Information from Wireless Sensor Networks Presented by Bo Han



Joint work with Aravind Srinivasan and Amol Deshpande

Outline

Introduction to wireless sensor networks

Communication architecture

Design factors of sensor networks

Applications of sensor networks

Correlated information collection

What we have done so far

Conclusion

Introduction

A sensor network is composed of a large number of sensor nodes, which are densely deployed either inside the phenomenon or very close to it.

Random deployment: the positions of sensor nodes need not be engineered or predetermined.

Self-organizing capabilities.

Cooperative effort of sensor nodes.

Sensor Networks vs Ad-Hoc Networks

The number of nodes in a sensor network can be several orders of magnitude higher than the number of nodes in an ad hoc network.

Sensor nodes are densely deployed.

Sensor nodes are prone to failures.

The topology of a sensor network changes very frequently.

Sensor Networks vs Ad-Hoc Networks

Sensor nodes mainly use broadcast, whereas most ad hoc networks are based on point-to-point communication.

Sensor nodes are limited in power, computational capacity and memory.

Sensor nodes may not have a global ID.

Communication Architecture

The sensor nodes are usually scattered in a sensor field.

Each of these scattered sensor nodes has the capabilities to collect data and route data back to the sink.

Data are routed back to the sink by a multi-hop infrastructureless architecture.

The sink may communicate with the task manager node via Internet or satellite.

Example of Sensor Networks

Data Delivery Models

Continuous: sensors communicate their data continuously at a prespecified rate.

Event driven: sensors report information only when the event of interest occurs.

Observer initiated (request-reply): sensors report their results only in response to an explicit request from the observer.

Hybrid: all three approaches coexist.

Protocol Stack

Five Layers

The physical layer addresses the needs of simple but robust modulation, transmission, and receiving techniques.

The MAC protocol must be power-aware and able to minimize collisions with neighbors' broadcasts.

The network layer takes care of routing the data supplied by the transport layer.

The transport layer helps to maintain the flow of data if the sensor network application requires it.

Different types of application software can be built and used on the application layer.

Three Planes

The power management plane manages how a sensor node uses its power.

The mobility management plane detects and registers the movement of sensor nodes, so a route back to the user is always maintained, and the sensor nodes can keep track of who their neighbor sensor nodes are.

The task management plane balances and schedules the sensing tasks given to a specific region.

Design Factors

Fault tolerance

Scalability

Production costs

Hardware constraints

Transmission media

Power consumption

Sensor network topology

Environment

Fault Tolerance

Fault tolerance is the ability to sustain sensor network functionalities without any interruption due to sensor node failures.

The fault tolerance level depends on the application of the sensor networks.

Scalability

Depending on the application, the number of sensor nodes may reach an extreme value of millions. New schemes must be able to work with this number of nodes.

Basically, the density gives the number of nodes within the transmission radius of each node in a region.

New schemes must also utilize the high density of the sensor networks.
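The density statement can be made concrete with a small calculation. The sketch below assumes the common formulation μ(R) = N·π·R² / A (used, for instance, in the survey by Akyildiz et al.), where N is the number of nodes scattered over a region of area A and R is the radio transmission range. The function name and the example numbers are illustrative, not from the slides.

```python
import math

def node_density(num_nodes: int, radio_range: float, area: float) -> float:
    """Expected number of nodes inside one node's transmission disc.

    Assumes nodes are spread uniformly over a region of the given area:
    mu(R) = N * pi * R^2 / A.
    """
    return num_nodes * math.pi * radio_range ** 2 / area

# Hypothetical deployment: 100 nodes, 10 m radio range, 100 m x 100 m field.
density = node_density(100, 10.0, 100.0 * 100.0)
```

With these made-up numbers each node has roughly three neighbors in range; doubling the radio range quadruples the density.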

Production Costs

The cost of a single node is very important to justify the overall cost of the network, since sensor networks consist of a large number of sensor nodes.

The cost of a sensor node should be less than a dollar.

Transmission Media

In a multihop sensor network, communicating nodes are linked by a wireless medium.

To enable global operation, the chosen transmission medium must be available worldwide.

Radio, infrared and optical media.

Power Consumption

Limited power source.

Battery lifetime is limited.

Each sensor node plays a dual role of data originator and data router (data processor).

The malfunctioning of a few nodes consumes a lot of energy (rerouting of packets and significant topological changes).

Sensor Network Topology

Pre-deployment and deployment phase: sensor nodes are either thrown in as a mass or placed one by one.

Post-deployment phase: topology changes are due to changes in sensor nodes' position, reachability, available energy, malfunctioning, and task details.

Re-deployment of additional nodes phase: additional sensor nodes can be redeployed at any time to replace malfunctioning nodes or due to changes in task dynamics.

Environment

Sensor nodes are densely deployed either very close or directly inside the phenomenon to be observed.

They usually work unattended in remote geographic areas.

They may be working in the interior of large machinery, at the bottom of an ocean, in a biologically or chemically contaminated field, in a battlefield beyond the enemy lines, and in a home or large building.

Applications of Sensor Networks

Military: battlefield surveillance; nuclear, biological and chemical attack detection and reconnaissance.

Environment: forest fire detection, flood detection.

Health: Telemonitoring of human physiological data, tracking and monitoring patients and doctors inside a hospital.

Home application: Home automation and smart environment.

Sensor Devices and Applications

Berkeley Motes

iBadge - UCLA

MIT d'Arbeloff Lab – The ring sensor

Nose-on-a-chip

Zilog's eZ80

iButton

Berkeley Motes

Small (under 1" square) microcontroller.

It consists of:

– Microprocessor

– A set of sensors for temperature, light, acceleration and motion

– A low power radio for communicating with other motes

C compiler inclusion.


iBadge - UCLA

Investigates the behavior of children/patients.

Features:

– Speech recording / replaying

– Position detection

– Direction detection / estimation (compass)

– Weather data: temperature, humidity, pressure and light


MIT d'Arbeloff Lab – The Ring Sensor

An ambulatory, telemetric, continuous health monitoring device developed by the d'Arbeloff Laboratory for Information Systems and Technology at MIT.

Monitors the physiological status of the wearer and transmits the information to the medical professional over the Internet.

Clinical trials have been done in conjunction with Massachusetts General Hospital's Emergency Room, and researchers are now working on commercialization of the ring-sized device.

Nose-on-a-chip

Nose-on-a-chip is a MEMS-based sensor, developed at Oak Ridge National Laboratory.

Can detect 400 species of gases and transmit a signal indicating the level to a central control station.

Consists of an array of tiny sensors on one integrated circuit and electronics on another.

The chip can be customized to detect virtually any chemical or biological species.

Zilog's eZ80

Provides a way to Internet-enable process control and monitoring applications.

Temperature sensor, water leak detector and many more applications.

Enables users to access Webserver data and files from anywhere in the world.

iButton

A 16mm computer chip armored in a stainless steel can.

Up-to-date information can travel with a person or object.

Correlated Information Gathering

Problem Definition

Correlation Modeling

Distributed Source Coding

Asymmetric Communication Channels

Our Approach

Fundamental Problem

Collecting information from distributed sources.

– Objective: exploit correlations to reduce the number of bits that must be sent.

Correlation examples in sensor networks:

– Weather in geographic region.

– Similar views of same image.

Focus: information theory

– Number of bits sent.

– Ignore network topology.

Modeling Correlation

k sensor nodes each have an n-bit string.

Input drawn from distribution D.

– Sample specifies all kn bits.

– Captures correlations and a priori knowledge.

Objective:

– Inform server of all k strings.

– Ideally: nodes send H(D) bits.

H(D): Binary entropy of D.

[Figure: sensor nodes holding strings x_1, x_2, ..., x_k]

Binary entropy of D

$$H(D) = \sum_{x \in D} \Pr[x] \, \log_2 \frac{1}{\Pr[x]}$$

Optimal code to describe sample from D:

– Expected number of bits required ≈ H(D).

• Ranges from 0 to kn.

– Easy if entire sample known by single node.

• Idea: shorter codewords for more likely samples.

Challenge of this problem: input distributed.
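To make the saving from correlation concrete, the sketch below computes H(D) for a small made-up joint distribution (k = 2 nodes, n = 1 bit each) and compares it with the cost of each node coding its own bit independently. The distribution and all names are illustrative assumptions, not data from the talk.

```python
import math
from collections import defaultdict

def entropy(dist):
    """H(D) = sum over samples x of Pr[x] * log2(1 / Pr[x])."""
    return sum(p * math.log2(1.0 / p) for p in dist.values() if p > 0)

def marginal(joint, node):
    """Distribution of a single node's string, summed out of the joint."""
    out = defaultdict(float)
    for strings, p in joint.items():
        out[strings[node]] += p
    return dict(out)

# Hypothetical joint distribution: the two nodes agree 90% of the time.
JOINT = {
    ("0", "0"): 0.45,
    ("1", "1"): 0.45,
    ("0", "1"): 0.05,
    ("1", "0"): 0.05,
}

joint_bits = entropy(JOINT)  # lower bound when correlations are exploited
separate_bits = sum(entropy(marginal(JOINT, i)) for i in range(2))  # ignoring correlation
```

Here each node's bit looks uniformly random on its own, so coding them separately costs 2 bits in expectation, while the joint entropy is noticeably smaller. The gap is what a correlated-collection scheme tries to recover even though no single node sees the whole sample.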

(Distributed) Source Coding

Source coding is the process of encoding information using fewer bits than an unencoded representation, also called data compression.

DSC refers to the compression of the outputs of two or more physically separated sources. The sources do not communicate with each other.

These sources send their compressed outputs to a central point for joint decoding. DSC is part of network information theory.

DSC has recently become a very active research area – more than 30 years after Slepian and Wolf laid its theoretical foundation.

Distributed Source Coding

[Slepian-Wolf, 1973]:

– Simultaneously encode r independent samples.

– As r → ∞,

• Bits sent by nodes → rH(D).

• Probability of error → 0.

Drawback: relies on r → ∞

– Recent research: try to remove this.

[Figure: each node i = 1, ..., k holds r samples x_i^1, x_i^2, ..., x_i^r]

Slepian-Wolf Theorem
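For reference, with two correlated sources X_1 and X_2 the theorem is usually stated as an achievable rate region. The notation below is the standard textbook form rather than anything taken from the slides: R_i is the rate, in bits per sample, at which source i transmits.

```latex
\begin{aligned}
R_1 &\ge H(X_1 \mid X_2), \\
R_2 &\ge H(X_2 \mid X_1), \\
R_1 + R_2 &\ge H(X_1, X_2).
\end{aligned}
```

The last line says that the total rate can be as low as the joint entropy even though the encoders never communicate with each other.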

Self-Coding & Foreign Coding

Network Coding

Network coding is a field of information theory and coding theory.

A method of attaining maximum information flow in a network.

The core notion of network coding is to allow mixing of data at intermediate network nodes.

A receiver sees these data packets and deduces from them the messages that were originally intended for that data sink.

Butterfly Network
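The butterfly network is the standard illustration of mixing data at an intermediate node. A minimal sketch, assuming the usual textbook setup (two sources, one shared bottleneck link, two sinks that each want both messages); the function name is illustrative:

```python
def butterfly(a: int, b: int):
    """Simulate one use of the butterfly network with XOR coding.

    Sink 1 hears `a` directly and the coded packet from the bottleneck;
    sink 2 hears `b` directly and the same coded packet.
    Returns what each sink decodes as (a, b).
    """
    coded = a ^ b                  # the bottleneck node mixes both messages
    sink1 = (a, a ^ coded)         # recovers b = a XOR (a XOR b)
    sink2 = (b ^ coded, b)         # recovers a = b XOR (a XOR b)
    return sink1, sink2
```

Without coding, the bottleneck link could carry only one of the two messages per use, so one sink would have to wait; with the XOR both sinks obtain both messages in a single use.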

New approach

Allow interactive communication!

– Nodes receive “feedback” from server.

– Server at least as powerful as nodes.

Power utilization:

– Central issue for sensor networks.

– Node sending is power intensive.

– Node receiving requires less power.

[Figure: nodes holding strings x_1, x_2, ..., x_k]

New approach

Communication model:

– Synchronous rounds:

• Nodes send bits to server.

• Server sends bits back to nodes.

• Nothing directly between nodes.

– Asymmetric Communication Channels

Objectives:

– Minimize bits sent by nodes.

• Ideally O(H(D)+k).

– Minimize bits sent by server.

– Minimize rounds of communication.

[Figure: nodes holding strings x_1, x_2, ..., x_k]

Who knows what?

[Figure: the server holds D; each of the eight nodes holds its own string X_i together with D]

Nodes: only know own string.

– Can also assume they know distribution D.

Server: knows distribution D.

– Typical in work on DSC.

– Some applications: D must be learned by the server.

• Most such cases: D varies with time.

• Crucial to have r as small as possible.

Fingerprint Function

A class of functions that map an n-bit string to a single bit such that if f is chosen uniformly at random from this class, then for any y_1 ≠ y_2, Pr[f(y_1) = f(y_2)] ≤ p, for some p > ½.

An n^3 × n matrix with each item chosen uniformly at random from {0, 1}.
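One way to read the random-matrix bullet, stated here as an assumption rather than the authors' exact construction, is that each row r of the 0/1 matrix defines one fingerprint function: the inner product of r with the n-bit string, taken mod 2. The sketch below represents strings and rows as Python integers and checks the collision probability exactly by enumerating every possible row.

```python
def fingerprint(row: int, y: int) -> int:
    """Single-bit fingerprint: inner product of row and y over GF(2)."""
    return bin(row & y).count("1") % 2

def collision_probability(y1: int, y2: int, n: int) -> float:
    """Exact Pr[f(y1) = f(y2)] when the row is uniform over all n-bit vectors."""
    rows = range(1 << n)
    hits = sum(fingerprint(r, y1) == fingerprint(r, y2) for r in rows)
    return hits / (1 << n)
```

For this particular class any two distinct strings collide with probability exactly 1/2, so every fingerprint bit is expected to eliminate half of the wrong candidates.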

Correlation

Correlation: Multivariate Normal Distribution.

Put sensor nodes uniformly at random in a unit square.

Covariance matrix.

– Diagonal items are all 1

– Cov(x, y) = d(x, y) r

Discretization of this continuous distribution.

Joint entropy estimation H(D).

Sample Set Generation

Determine the size of sample set.

Compute the Cholesky decomposition (matrix square root) of Σ, that is, find the unique lower triangular matrix A such that AA^T = Σ.

Let Z = (z_1, …, z_n)^T be a vector whose components are n independent standard normal variates (which can be generated, for example, by using the Box-Muller transform).

Let X = μ + AZ (here, μ = 0).
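The three steps above can be sketched in plain Python. The 3 × 3 covariance matrix below is a made-up positive definite example with unit diagonal, not the distance-based covariance from the talk, and the standard normal draws come from `random.gauss` rather than a hand-written Box-Muller transform.

```python
import math
import random

def cholesky(sigma):
    """Return lower triangular A with A * A^T = sigma (sigma positive definite)."""
    n = len(sigma)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(a[i][m] * a[j][m] for m in range(j))
            if i == j:
                a[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                a[i][j] = (sigma[i][j] - s) / a[j][j]
    return a

def draw_sample(sigma, rng):
    """Draw X = A Z with Z a vector of independent standard normals (mu = 0)."""
    a = cholesky(sigma)
    n = len(sigma)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(a[i][m] * z[m] for m in range(i + 1)) for i in range(n)]

# Hypothetical covariance for three sensors (unit variance on the diagonal).
SIGMA = [
    [1.0, 0.8, 0.5],
    [0.8, 1.0, 0.6],
    [0.5, 0.6, 1.0],
]

sample_set = [draw_sample(SIGMA, random.Random(seed)) for seed in range(5)]
```

In practice the decomposition would be computed once and reused for every draw; it is recomputed here only to keep the sketch short.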

The Algorithm

The algorithm can figure out the sensor node string with high probability in log_2(H(D)) phases.

Each phase has two rounds.

First round: sensor nodes send fingerprint bits to the server.

Second round: server sends feedback to the sensor nodes.

Phase i – First Round

The node sends some number of fingerprint bits to the server, each specified by a fingerprint function chosen randomly.

When each fingerprint bit is processed, the inputs in the sample set that do not agree with that bit are discarded from consideration.

Before processing each fingerprint bit sent by the node, the server checks to see if it is an unbalanced bit, that is, if some string x agrees with more than half of the elements of the sample set.

Such a string is called a heavy string.

Phase i – Second Round

The server sends the heavy string to the node.

The node responds with a single indicator bit.

The server keeps track of the total number of balanced bits that it has received.

The server continues this process until it has received some predetermined number of balanced bits or the sample set is empty.
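A heavily simplified, single-node sketch of the two rounds is given below. It is not the authors' algorithm: it drops the phase structure and the balanced-bit counter, assumes the node and server share the random fingerprint choices, and uses inner-product-mod-2 fingerprints. All names are hypothetical.

```python
import random
from collections import Counter

def parity(v: int) -> int:
    return bin(v).count("1") % 2

def collect(node_string, sample_set, n, rng, max_bits=10_000):
    """Return (string recovered by the server or None, bits sent by the node)."""
    candidates = list(sample_set)      # server's samples drawn from D
    bits_sent = 0
    while candidates and bits_sent < max_bits:
        heavy, count = Counter(candidates).most_common(1)[0]
        if 2 * count > len(candidates):
            # Second round: server sends the heavy string, node replies
            # with one indicator bit saying whether it matches.
            bits_sent += 1
            if heavy == node_string:
                return heavy, bits_sent
            candidates = [c for c in candidates if c != heavy]
            continue
        # First round: node sends one fingerprint bit; the server discards
        # every sample that disagrees with it.
        row = rng.getrandbits(n)
        bit = parity(row & node_string)
        bits_sent += 1
        candidates = [c for c in candidates if parity(row & c) == bit]
    return None, bits_sent

# Hypothetical sample set of 4-bit strings; the node holds 0b0111.
samples = [5, 5, 5, 9, 12, 3, 7, 7]
recovered, cost = collect(7, samples, 4, random.Random(0))
```

Because the server only accepts a string after the node confirms it, this sketch never returns a wrong answer; it returns None when the node's string is missing from the sample set, which is one reason the size of the sample set matters.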

Conclusions

Lots of interesting problems in wireless sensor networks.

New technique for collecting distributed and correlated information (ongoing project).

Allows for arbitrary distributions and correlations.

Single sample sufficient.

Open problems:

– Lower bound on rounds.

– Incorporating network topology.

References

Top conferences: MobiCom, MobiHoc, Sensys, IPSN, Infocom, SECON, SODA, STOC, FOCS.

I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, Wireless Sensor Networks: a Survey, Computer Networks 38 (2002) 393-422.

Z. Xiong, A. Liveris, and S. Cheng, Distributed Source Coding for Sensor Networks, IEEE Signal Processing Magazine 21 (2004) 80-94.

A. Ramamoorthy, K. Jain, P. Chou, and M. Effros, Separating Distributed Source Coding from Network Coding, IEEE/ACM Transactions on Networking 14 (2006) 2785-2795.

J. Liu, M. Adler, D. Towsley, and C. Zhang, On Optimal Communication Cost for Gathering Correlated Data through Wireless Sensor Networks, In Proceedings of MobiCom 2006.

T. Batu, S. Dasgupta, R. Kumar, and R. Rubinfeld, The Complexity of Approximating Entropy, In Proceedings of STOC 2002.

M. Adler, Collecting Correlated Information from a Sensor Network, In Proceedings of SODA 2005.

Questions?
