Growth Codes: Maximizing Sensor Network Data Persistence
Abhinav Kamra, Vishal Misra, Dan Rubenstein
Department of Computer Science, Columbia University
Jon Feldman
Google Labs
DNA Research Group

Outline
- Problem Description
- Solution Approach: Growth Codes
- Experiments and Simulations
- Conclusions and Ongoing work
ACM Sigcomm 2006

Background: A generic sensor network
- Sensor nodes hold sensed data (x1, x2, ..., x13)
- Data follows a multi-hop path to the sink(s)
- A few node failures can break the data flow
- Generic aim: collect data from all nodes at the sink(s)
[Figure: sensor nodes holding sensed data x1 to x13, forwarding it over multi-hop paths to the sink(s)]

Specific Context: Disaster Scenarios
- E.g., monitoring earthquakes, fires, floods, war zones
- Problems in this setting
  - Congestion near sink(s)
    - All nodes simultaneously forward data
    - Overwhelm sink(s) capacity
[Figure: virtual queue illustrating congestion near the sink]

Specific Context: Disaster Scenarios - 2
- Problems in this setting
  - Network collapsing: nodes failing rapidly
    - Pre-computed routes may fail
    - Data from failed nodes can be lost
    - Data recovery from a subset of nodes is acceptable

Challenges
- Networking challenges:
  - Disaster scenarios: feedback often infeasible
  - Frequent disruptions to the routing tree, if one is set up
  - Difficult to predict node failures: sink locations unknown, surviving routes unknown
  - Difficult to synchronize nodes' clocks
- Coding challenges:
  - Data source distributed (among all sensor nodes)
  - Prior approaches (Turbo codes, LDPC codes) aim at fast complete recovery
  - Sensor nodes have very limited memory, CPU, bandwidth

Objectives
- Data persistence: fraction of data that eventually reaches the sink(s)
  - Example: 6 of 10 symbols reach the sink, so persistence = 60%
- Preserve data from failed sensor nodes
- Deliver data to the sink(s) as fast as possible
- Maximize data persistence
[Figure: sensor network in which 6 of 10 symbols reach the sink]

Limitations of Previous Work
- Channel coding based (e.g. Turbo Codes [Anderson-ISIT94], LT Codes [Luby02])
  - Aim for complete recovery in minimum time
  - Difficult to implement with distributed sources
- Routing-based (e.g. Directed Diffusion [Govindan00], Cougar [Yao-SIGMOD02])
  - Conjecture: too fragile (disrupted easily) for disaster scenarios

Our Approach
- Two main ideas
  - Randomized routing and replication
    - Avoid actively maintaining routes
    - Replicate data to increase data survival
  - Distributed channel codes (Growth Codes)
    - Expedite data delivery & survivability
- First (to our knowledge) distributed channel codes

Outline
- Problem Description
- Our Solution: Growth Codes
- Experiments and Simulations
- Conclusions and Ongoing work

Network Assumptions
- N node sensor network
- Limited storage: each node stores a small number of data units
- Large storage at sink(s): sink receives codewords from random node(s)
- All sensed data assumed independent (no source coding)
[Figure: example network of sensor nodes 1 to 7 and sinks (S)]

High Level View of the Protocol
- Nodes send data at random times (current implementation: exponentially distributed timers)
[Figure: nodes 1 to 4 exchanging data]
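
The random send times can be sketched as follows. This is a minimal illustration, assuming each node draws independent, exponentially distributed gaps between transmissions. The function name `send_times`, the `rate` and `horizon` parameters, and the numeric values are hypothetical and are not taken from the authors' implementation.

```python
import random

def send_times(rate, horizon, rng):
    """Return the times in [0, horizon) at which one node transmits.

    Gaps between transmissions are exponentially distributed with the
    given rate (mean gap = 1 / rate). All names here are illustrative.
    """
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical 4-node network, each node with the same (assumed) send rate.
schedule = {node: send_times(rate=1.0, horizon=10.0, rng=rng) for node in (1, 2, 3, 4)}
for node, times in schedule.items():
    print(node, [round(t, 2) for t in times])
```

Because every node draws its own timer, no coordination between nodes is needed to decide who sends when.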

High Level View of the Protocol (2)
- Before time K1, nodes send degree 1 codewords (single symbols)
- After time K1, nodes start sending degree 2 codewords
  - Sender picks a random symbol
  - XORs it with its own symbol
- Even if node 3 fails, node 3's data survives
[Figure: nodes 1 to 4 exchanging symbols, degree 1 codewords and a degree 2 codeword; timeline marked 0, K1, K2, K3]
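
The degree 2 step can be illustrated with a short sketch. It assumes symbols are small integers and that a codeword carries the set of source ids it combines; the function name `make_degree2_codeword` and the example values are hypothetical.

```python
import random

def make_degree2_codeword(own_id, symbols, rng):
    """XOR the sender's own symbol with one other symbol it currently stores.

    `symbols` maps node id -> integer data symbol held by the sender.
    Returns (set of ids combined, XOR of their values).
    """
    others = [i for i in symbols if i != own_id]
    other_id = rng.choice(others)
    return {own_id, other_id}, symbols[own_id] ^ symbols[other_id]

# Hypothetical example: node 1 stores its own symbol and a copy of node 3's.
stored_at_node1 = {1: 0b1010, 3: 0b0110}
ids, value = make_degree2_codeword(1, stored_at_node1, random.Random(0))

# A sink that already knows node 1's symbol recovers node 3's symbol
# by XORing it out, even if node 3 itself has failed.
recovered = value ^ stored_at_node1[1]
print(ids, bin(value), bin(recovered))
```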

High Level View of the Protocol (3)
- After time K1, nodes start sending degree 2 codewords
- After time K2, nodes start sending degree 3 codewords
- ...
- After time Ki, nodes start sending degree i+1 codewords
- What are good values for {Ki}? Please refer to our paper
- Note: no need to tightly synchronize clocks (times Ki can be out of sync at different nodes)
[Figure: timeline marked 0, K1, K2, K3]
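
The degree schedule amounts to a lookup of the local time against the switch times. The sketch below assumes the Ki are given as a sorted list; the name `codeword_degree` and the numeric thresholds are placeholders, not the values derived in the paper.

```python
from bisect import bisect_right

def codeword_degree(t, switch_times):
    """Degree of the codewords a node sends at its local time t.

    switch_times = [K1, K2, ...] in increasing order: degree 1 before K1,
    degree i + 1 from time Ki onward.
    """
    return bisect_right(switch_times, t) + 1

K = [5.0, 9.0, 12.0]  # placeholder values, NOT the thresholds from the paper
for t in (0.0, 4.9, 5.0, 10.0, 20.0):
    print(t, codeword_degree(t, K))
```

Each node evaluates this against its own clock, which is why loosely synchronized clocks only shift the switch points slightly rather than breaking the scheme.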

The Intuition behind Growth Codes
- When very few symbols are decoded: easy to decode low degree codewords
[Figure: codewords arriving over time alongside the set of symbols decoded at the sink]
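
To make "easy to decode" concrete, here is a sketch of one simple way a sink could decode XOR codewords: strip out already-known symbols and accept any codeword that is left with exactly one unknown. This iterative scheme is an assumption made for illustration; the slides do not specify the sink's decoder, and the name `decode` and the example symbols are hypothetical.

```python
def decode(codewords):
    """Iteratively decode XOR codewords at the sink.

    Each codeword is (ids, value): the set of source ids that were XORed
    together and the resulting integer. Returns {id: symbol} for every
    symbol that could be recovered.
    """
    decoded = {}
    pending = [(set(ids), value) for ids, value in codewords]
    progress = True
    while progress:
        progress = False
        still_pending = []
        for ids, value in pending:
            # Strip out symbols the sink already knows.
            for i in [i for i in ids if i in decoded]:
                value ^= decoded[i]
                ids.discard(i)
            if len(ids) == 1:
                decoded[ids.pop()] = value  # exactly one unknown: solved
                progress = True
            elif len(ids) > 1:
                still_pending.append((ids, value))  # keep for a later pass
            # len(ids) == 0: redundant codeword, drop it
        pending = still_pending
    return decoded

# Hypothetical symbols x1 = 5, x2 = 9, x3 = 12.
received = [({1, 2}, 5 ^ 9), ({1}, 5), ({2, 3}, 9 ^ 12)]
result = decode(received)
print(result)
```

In this example the single degree 1 codeword unlocks both degree 2 codewords; without it, neither could be decoded.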

The Intuition behind Growth Codes (2)
- When a significant number of symbols are decoded:
  - Low degree codewords are often redundant
  - Higher degree codewords are more likely to be useful
[Figure: codewords alongside the set of symbols decoded at the sink]
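
The intuition can be checked with a small calculation. Assuming a degree d codeword combines d distinct symbols chosen uniformly at random out of n, it yields a new symbol immediately only when exactly d - 1 of them are among the r symbols the sink already has, which happens with probability C(r, d-1)(n - r) / C(n, d). The function name `prob_new_symbol` and the choice n = 100 are illustrative.

```python
from math import comb

def prob_new_symbol(n, r, d):
    """Probability that a random degree-d codeword over n symbols decodes
    immediately to a symbol the sink does not have, given r symbols known.

    Requires exactly d - 1 of the d combined symbols to be already decoded.
    """
    return comb(r, d - 1) * (n - r) / comb(n, d)

n = 100  # illustrative network size
for r in (0, 50, 90):
    # Most useful degree among 1..10 for this number of decoded symbols.
    best = max(range(1, 11), key=lambda d: prob_new_symbol(n, r, d))
    print(r, best, [round(prob_new_symbol(n, r, d), 3) for d in (1, 2, 3)])
```

Under this assumption, degree 1 is the only useful degree when nothing has been decoded, and the most useful degree grows as the sink recovers more symbols.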

Outline
- Problem Description
- Growth Codes
- Simulations and Experiments
- Conclusions and Ongoing work

Simulations/Experiments:
Compare data persistence of various approaches
1. Simulations:
   - Centralized setting: compare GC with other channel coding schemes
   - Distributed simulation: assess large-scale performance of coding vs no coding
2. Experiments on motes:
   - Compare time of complete recovery for GC vs routing
   - Measure resilience to node failures

Comparison with various coding schemes (N = 1500)
Centralized Simulation (to compare with other channel coding schemes for which only centralized versions exist)
- Single source, single sink
  - Source generates random codewords according to the coding scheme (GC, Soliton)
  - Zero failure rate
- No coding is fast in the beginning: slowdown explained via the Coupon Collector's problem
- Soliton/R-Soliton: poor partial recovery (reason: high degree codewords sent too early)
- Growth Codes closest to theoretical upper bound (reason: right degree at the right time)
[Figure: single source sending codewords to a single sink]

Growth Codes vs No Coding (Varying N)
Distributed Simulation (to assess the performance gain of coding)
- N sources, single sink
- Random graph topology (avg degree 10)
- Sink receives 1 codeword per time unit
Complete recovery takes:
- O(N log N) time without coding (Coupon Collector's effect)
- Linear time with Growth Codes
Soliton/R-Soliton: cannot compare in a distributed setup
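
The O(N log N) figure for the uncoded case follows from the Coupon Collector's problem. As a rough sketch, assume the sink receives one uncoded symbol per time unit, drawn uniformly at random from the N sources; the expected time to see all N is then N times the N-th harmonic number. The function name `expected_draws_no_coding` is hypothetical, and the sketch ignores topology effects.

```python
def expected_draws_no_coding(n):
    """Expected number of uniformly random uncoded symbols the sink must
    receive before it has seen all n distinct symbols: n * H_n."""
    return n * sum(1.0 / k for k in range(1, n + 1))

for n in (100, 500, 1500):
    total = expected_draws_no_coding(n)
    # The per-symbol cost total / n grows with n (roughly like ln n),
    # whereas a linear-time scheme keeps it bounded by a constant.
    print(n, round(total), round(total / n, 2))
```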

Experiments with (micaz) motes
(to measure data persistence with time)
- GC vs TinyOS's "MultiHop" routing protocol
  - No routing state at time 0 (scenario where sensor nodes are deployed rapidly)
- "MultiHop" for persistence: takes a long time to complete route setup
- Comparison with GC simulator validates simulator performance
[Figure: experimental topology with sink S]

Motes experiments: Resilience to node failures
- Nodes generate data every 300 seconds
- 3 nodes fail just after the 3rd data generation
[Figure: experimental topology with sink S, and a timeline from 0 to 900 seconds (marks at 0, 300, 600, 900) annotated with: nodes generate data, "MultiHop" sets up routing, nodes send data to sink, 3 random nodes fail, "MultiHop" repairs routes]

Motes experiments: Resilience to node failures
- 1st generation: GC faster, MH (MultiHop) takes time to set up routes
- 2nd generation: routing already set up, MH very fast
- 3rd generation: MH needs to repair routes
[Figure: timeline from 0 to 900 seconds (marks at 0, 300, 600, 900) annotated with: nodes generate data, "MultiHop" sets up routing, nodes send data to sink, 3 random nodes fail, "MultiHop" repairs routes]

Other Results: Please refer to our paper
- Good values for K1, K2, …
- More simulations/experiments
  - Various topologies
  - Other failure scenarios
- Implementation details:
  - Memory usage at sensor nodes: how it affects performance
  - How to handle periodic data generation
  - How to reduce overhead of coefficients

Conclusions
- Data persistence in sensor networks:
  - First distributed channel codes (GC)
  - Protocol requires minimal configuration
  - Is robust to node failures
- Simulations and experiments on micaz motes show (compared to prior coding and routing methods):
  - GC achieves complete recovery faster
  - GC recovers more partial data at any time

Ongoing Work
- Adapt Growth Codes to scenarios where sensor data is correlated
- Take advantage of any available routing information (e.g. before a disaster)
- Estimate network size on the fly to use in Growth Codes

Thanks for your patience!
For more information:
DNA Research Lab, Columbia University
http://dna-wsl.cs.columbia.edu/