TASC: Topology Adaptive Spatial
Clustering for Sensor Networks
Reino Virrankoski and Andreas Savvides
Yale University,
Embedded Networks and Applications Laboratory
http://www.eng.yale.edu/enalab/
Distributed Spatial Clustering


Sensor nets are inherently coupled to physical
space
Need a means to organize them in a meaningful
way



Spatial sampling
 Clustering that respects network topology enables a higher data compression rate than clustering that ignores it
 Hierarchical structures can be built bottom-up by applying distributed clustering
Transmission power control
 Clustering with respect to spatial attributes enables lower transmission power in intra-cluster communication
Frequency allocation in dense networks
Clustering Target & Idea

TASC is a distributed algorithm that partitions a network with non-uniform density into a set of smaller, non-overlapping clusters. It groups nodes with similar density attributes so that the node density variation within each cluster is smaller than the node density variation in the whole network

Analogous to grid partitioning in deterministic deployments
[Figure: clustering in the case of uniform node deployment]
[Figure: TASC clustering outcome in the case of non-uniform node deployment]
Assumptions
1) Each node has a static node id
2) The network is static, with very infrequent updates
3) Nodes can measure distances to other nodes in
their vicinity

Algorithm must tolerate measurement noise
4) Nodes proactively discover their 2-hop
neighborhood

Each node knows the neighboring nodes and measured
distances in its 2-hop neighborhood
Requirements & Objectives



Distributed or locally distributed solution
Node distribution and density inside the
clusters should be as uniform as possible
Clusters should be as round as possible
Cluster Evaluation Metrics

Cluster evaluation is based on Delaunay triangulation
By definition, a Delaunay triangulation of a finite set of points in the plane is a triangulation in which no point lies inside the circumcircle of any triangle. Among all triangulations of the point set it maximizes the minimum angle, so it favors triangles whose angles are close to 60 degrees
 It follows from the definition that the Delaunay triangulation
gives an optimal planar subdivision in terms of spatial
uniformity
 Relative node density variation is Delaunay triangle edge length
standard deviation in a cluster divided by average Delaunay
triangle edge length in that same cluster:

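$$\mathrm{rel\_dv} = \frac{1}{\bar{s}} \sqrt{\frac{1}{N} \sum_{i=1}^{N} (s_i - \bar{s})^2}, \qquad \bar{s} = \frac{1}{N} \sum_{i=1}^{N} s_i$$
where $s_i$ is the length of Delaunay triangle edge $i$ in the cluster, $N$ is the number of such edges, and $\bar{s}$ is their average length.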
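The relative node density variation can be illustrated with a minimal Python sketch. It assumes the Delaunay edge lengths of a cluster are already available as a plain list and that the population (1/N) standard deviation is intended. The function name `relative_density_variation` is hypothetical and not part of TASC.

```python
from statistics import mean, pstdev


def relative_density_variation(edge_lengths):
    """Population std of the edge lengths divided by their mean.

    ASSUMPTION: edge_lengths holds the Delaunay triangle edge lengths
    of one cluster, each edge listed once.
    """
    return pstdev(edge_lengths) / mean(edge_lengths)


# A perfectly regular cluster has no variation at all.
uniform_edges = [10.0, 10.0, 10.0, 10.0]
# A cluster with mixed edge lengths has a larger relative variation.
mixed_edges = [5.0, 10.0, 15.0, 10.0]

print(relative_density_variation(uniform_edges))
print(relative_density_variation(mixed_edges))
```

Dividing by the mean makes the metric scale free, so clusters of different physical size can be compared directly.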
Cluster Evaluation Metrics
 Cluster area is the sum of the cluster's Delaunay triangle areas. For the example cluster with triangles A1 to A13:
$$\mathrm{cluster\_area} = \sum_{i=1}^{13} A_i$$
 Cluster density is the number of nodes in the cluster divided by the cluster area. For the example cluster with 12 nodes and area A:
$$\mathrm{density} = \frac{12}{A} \ \text{nodes}/\text{m}^2$$
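The area and density metrics can be sketched in Python as follows. The sketch assumes that node coordinates are known to the evaluator (as in a simulation) and that the Delaunay triangles are given as coordinate triples. All function names are hypothetical.

```python
def triangle_area(p, q, r):
    """Area of a triangle from its corner coordinates (shoelace formula)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0


def cluster_area(triangles):
    """Sum of the areas of the cluster's Delaunay triangles."""
    return sum(triangle_area(p, q, r) for p, q, r in triangles)


def cluster_density(num_nodes, triangles):
    """Number of nodes divided by cluster area (nodes per square meter)."""
    return num_nodes / cluster_area(triangles)


# Four nodes on the corners of a 10 m x 10 m square, split into two triangles.
square = [
    ((0.0, 0.0), (10.0, 0.0), (10.0, 10.0)),
    ((0.0, 0.0), (10.0, 10.0), (0.0, 10.0)),
]
print(cluster_area(square))        # 100.0
print(cluster_density(4, square))  # 0.04
```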
Novelties




Algorithm uses internode distances => locations not needed
Network is partitioned into a set of locally uniform, non-overlapping clusters without prior knowledge of the number of clusters, cluster size or node coordinates
Distributed algorithm, where all needed information and messaging is confined to each node's 2-hop environment
Each node computes its weight based on the shortest Euclidean paths in its 2-hop environment
 Approximation of the local center of mass
Each node applies dynamic density reachability criteria to find the nodes in its 2-hop environment that are located in an area of similar or higher density
 Grouping of nodes according to their neighborhood density properties
By applying weights and density reachability, a node is able to capture local distance, connectivity and density information
Weights



Come up with a weight for each node that
characterizes the network structure
Analogous to greedy forwarding in geographic
routing
 Nodes closer to the center of the network are used more frequently as intermediate hops on routes
 These would be the heaviest nodes in TASC
Use weights to drive leader election and clustering
inside the network
Weights Example

Main idea: Count how often a node is found on the shortest path between two nodes
Node B is found on the following seven
shortest paths: AB, BC, BD, BE, AC, AD
and AE
[Figure: example network with nodes A, B, C, D and E; node B has weight 7]
When each node computes its weight in the same way as B, the following result is achieved:
[Figure: the same network with weights A = 4, B = 7, C = 8, D = 7, E = 4]
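The hop-based weights of this example can be reproduced with a short Python sketch. It assumes the five nodes form a chain A-B-C-D-E, which is consistent with the listed paths and weights, and that a path's endpoints count as being on the path. When several shortest paths tie, the sketch counts only one of them. The function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path(adj, src, dst):
    """Return one shortest hop path from src to dst, found by BFS."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if dst not in parent:
        return []  # dst is not reachable from src
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def hop_weights(adj):
    """Count, for every node, the shortest paths it lies on."""
    weights = {node: 0 for node in adj}
    for a, b in combinations(sorted(adj), 2):
        for node in shortest_path(adj, a, b):
            weights[node] += 1
    return weights


# ASSUMPTION: the example network is the chain A-B-C-D-E.
chain = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
print(hop_weights(chain))  # {'A': 4, 'B': 7, 'C': 8, 'D': 7, 'E': 4}
```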
Weights Example
In an idealized uniform case there is no variation in the weights. However, the smooth weight distribution at least tells us that the network structure is completely uniform, so a simple grid-based clustering method can be applied.
In the non-uniform case, the weights indicate the centers of the local structures.
Incorporating Distance Information

Weight computation based only on hops does not give
enough information in the case where hop paths are
symmetric, but the Euclidean lengths of the paths are
different. To handle this problem, we augment the weight
computation to incorporate distance information

Each time the node is used in the path, the weight is
incremented as a function of the distance a node contributes to
the path
$$\text{New weight} = \frac{\text{length of edges contributed by a node}}{\text{length of path}}$$
[Figure: example network with nodes A, B, C, D, E, G and H, annotated with edge lengths and the resulting weights]
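One possible reading of the distance-based weight is sketched below in Python. The slides do not spell out what "edges contributed by a node" means. The sketch ASSUMES it is the total length of the path edges that touch the node, so an intermediate node contributes two edges and an endpoint contributes one. Shortest Euclidean paths are found with Dijkstra's algorithm. The function names are hypothetical.

```python
import heapq
from itertools import combinations


def euclidean_shortest_path(graph, src, dst):
    """Dijkstra over graph[u][v] = measured distance; returns the node list."""
    dist = {src: 0.0}
    parent = {src: None}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in parent:
        return []
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def distance_weights(graph):
    """Add (length of edges touching the node) / (path length) per path."""
    weights = {node: 0.0 for node in graph}
    for a, b in combinations(sorted(graph), 2):
        path = euclidean_shortest_path(graph, a, b)
        edges = [graph[u][v] for u, v in zip(path, path[1:])]
        total = sum(edges)
        if total == 0:
            continue  # no path between a and b
        for k, node in enumerate(path):
            # ASSUMPTION: a node contributes the path edges incident to it.
            contributed = 0.0
            if k > 0:
                contributed += edges[k - 1]
            if k < len(edges):
                contributed += edges[k]
            weights[node] += contributed / total
    return weights


# Chain A-B-C with a short edge A-B and a long edge B-C.
graph = {"A": {"B": 1.0}, "B": {"A": 1.0, "C": 3.0}, "C": {"B": 3.0}}
print(distance_weights(graph))  # {'A': 1.25, 'B': 3.0, 'C': 1.75}
```

With hop counts alone, A and C would receive the same weight. The distance term separates them because C sits on the longer edge.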
Density Reachability



The information of node weights can be used to identify local
centers
In addition, nodes must be grouped in regions with similar
density attributes
Density reachability is traditionally applied in data clustering to cluster spatial data in the presence of obstacles


Ester, M., Kriegel, H.-P., Sander, J., Xu, X., "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise", 2nd International Conference on Knowledge Discovery and Data Mining (KDD'96), Portland, Oregon, 1996
Zaïane, O. R., Lee, C.-H., "Clustering Spatial Data in the Presence of Obstacles: a Density-Based Approach", Sixth International Database Engineering and Applications Symposium (IDEAS 2002), Edmonton, Alberta, Canada, July 17-19, 2002
[Figure from Zaïane et al., referenced above]
Density Reachability

Based on the distances between a node and its closest neighbors, each node must further limit its 2-hop neighborhood to the subgroup of nodes in which the density, in terms of distances, is similar or higher:
[Figure: node i, its whole 2-hop neighborhood, and the subset of that neighborhood where the density in terms of distances is similar or higher]
Density Reachability

We apply an adaptive version of density reachability

Each node picks its own density range based on the parameter Dr, which is the number of nodes (including the node itself) that must be located within the density range of the choosing node
[Figure: node i and node j pick their density ranges ri and rj when given Dr = 4]
The density range of node i is the minimum radius for which there are Dr nodes within the disk centered at node i. Thus, the density range is specific to each node and carries information on local density.
Dr is a parameter given as input to the algorithm. It defines the resolution at which the algorithm differentiates between more and less dense regions.
Density Reachability Example (1/5)

Node i selects its density range ri
 When Dr = 4, the density range is the distance to the (Dr - 1) = 3rd nearest neighbor
[Figure: node i and its density range ri]
Density Reachability Example (2/5)

All nodes within density range ri from node i are
density reachable
[Figure: nodes within density range ri of node i]
Density Reachability Example (3/5)

Each node that is in the 2-hop environment of node i and within ri of one of the density reachable nodes is also density reachable from node i
 Two new density reachable nodes are found within ri of node j
[Figure: node i, node j and density range ri]
Density Reachability Example (4/5)

One more node that is density reachable from node
i is found within ri from node k
[Figure: nodes i, j and k and density range ri]
Density Reachability Example (5/5)

By applying density reachability with the given Dr = 4, node i has traced the subset of its 2-hop environment where the density, in terms of distances, is equal or higher

Node i can choose its nominee only from its density reachable
nodes
[Figure: nodes that are density reachable from node i]
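The five example steps can be summarized in a minimal Python sketch. It assumes the measured distances inside the 2-hop neighborhood of node i are available as a nested dictionary `dist[u][v]` and that Dr is at least 2. If a node has fewer than Dr - 1 measured neighbors, the sketch falls back to the farthest one, which is an assumption not covered by the slides. The function names are hypothetical.

```python
def density_range(dist, i, dr):
    """Distance from node i to its (dr - 1)-th nearest measured neighbor."""
    ranges = sorted(dist[i].values())
    # ASSUMPTION: with too few neighbors, use the farthest measured one.
    return ranges[min(dr - 2, len(ranges) - 1)]


def density_reachable(dist, i, dr):
    """Nodes reachable from i by steps no longer than i's density range.

    dist must only contain nodes of i's 2-hop neighborhood.
    """
    r_i = density_range(dist, i, dr)
    reachable = {i}
    frontier = [i]
    while frontier:
        u = frontier.pop()
        for v, d in dist[u].items():
            if v not in reachable and d <= r_i:
                reachable.add(v)
                frontier.append(v)
    reachable.discard(i)
    return reachable


# Nodes on a line: a dense group near i and one far-away node f.
pos = {"i": 0.0, "a": 1.0, "b": 2.0, "c": 3.0, "e": 4.5, "f": 10.0}
dist = {u: {v: abs(pos[u] - pos[v]) for v in pos if v != u} for u in pos}

print(density_range(dist, "i", 4))              # 3.0
print(sorted(density_reachable(dist, "i", 4)))  # ['a', 'b', 'c', 'e']
```

Node e lies outside the density range of i but within that range of c, so it is still density reachable. Node f is farther than the range from every reachable node and is left out.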
Distributed Clustering Algorithm

Inputs:

2-hop neighborhood

Inter-node distance measurements in 2-hop
neighborhood

Parameter Dr for density reachability

Required minimum cluster size (minimum
number of nodes)
Distributed Clustering Algorithm Overview
At each node:
1. Compute weight from 2-hop neighborhood
2. Exchange weights by 2-hop broadcast
3. Nominate heaviest node in the density
reachable subset and broadcast to 2-hop
neighbors
4. Elect closest nominee as the leader
5. If the number of nodes in the cluster > threshold, stop; otherwise join the closest cluster with size > threshold
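Steps 3 and 4 above can be sketched in Python as two small selection functions. The sketch assumes each node already holds the exchanged weights and its distances to the nominees it has heard. Breaking ties by the smaller node id is an assumption, as are the function and parameter names.

```python
def nominate(weights, candidates):
    """Step 3: pick the heaviest node among the density reachable candidates.

    ASSUMPTION: ties are broken in favor of the smaller node id.
    """
    return max(sorted(candidates), key=lambda node: weights[node])


def elect_leader(distance_to, nominees):
    """Step 4: pick the closest nominee heard in the 2-hop neighborhood."""
    return min(sorted(nominees), key=lambda node: distance_to[node])


# Weights received from the 2-hop broadcast (illustrative values).
weights = {"a": 2.0, "b": 5.0, "c": 3.5}

# This node can only nominate from its density reachable subset.
my_nominee = nominate(weights, {"a", "b"})

# Nominees announced by 2-hop neighbors, with this node's distance to each.
heard_nominees = {"b", "c"}
distance_to = {"b": 40.0, "c": 25.0}
my_leader = elect_leader(distance_to, heard_nominees)

print(my_nominee, my_leader)  # b c
```

A node's leader can differ from its own nominee, as in this example, because election is driven by distance while nomination is driven by weight.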
Leader Election Example (1/4)
1. Each node computes its own weight based on shortest Euclidean paths in its 2-hop environment
[Figure annotation: Compute weight based on shortest Euclidean paths in my 2-hop environment]
Leader Election Example (2/4)
2. Each node finds its density reachable nodes
[Figure annotation: Find the subset of my 2-hop environment that is density reachable]
Leader Election Example (3/4)
3. Each node selects the density reachable node with the largest weight as its nominee and broadcasts its nomination to its 2-hop neighborhood
[Figure annotations: Nominate the node that has the largest weight among my density reachable nodes; broadcast my nominee to my 2-hop neighborhood]
Leader Election Example (4/4)
4. Each node receives all nominations in its 2-hop neighborhood and elects the closest nominee as its leader
[Figure annotations: My original nomination; listen to all nominees in my 2-hop neighborhood; select the closest nominee to be my leader, that is, the closest node that is in my 2-hop neighborhood and that is nominated by some of my 2-hop neighbors]
Simulation Setup


Simulations were run on a suite of 100 random scenarios, each with 100 nodes deployed on a square field of size 1000 x 1000
Distance measurement range was assumed to be equal to the communication range
 In practice, the communication range is expected to be greater than the distance measurement range, but the equality assumption does not violate the fundamental properties of TASC
Each scenario was used five times over different connectivity levels
 The connectivity was varied by varying the maximum measurement range from 200 to 400 in steps of 50
 The respective average node connectivity in each case was 10.31, 15.31, 21.09 and 33.80. Even though this is relatively high, the density variations in the network scenarios were so large that not all nodes were connected to the network when the maximum range was reduced below 200
Simulation Setup


Required minimum number of nodes per cluster was set to 4
Simulations were implemented with a combination of Matlab and an in-house version of NeslSim
 The main role of the NeslSim environment was to enforce a distributed implementation of TASC
 The computation of shortest paths was done using the Floyd-Warshall algorithm running at each node
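For reference, the Floyd-Warshall step can be sketched in Python as below. This is a generic textbook version, not the simulation code. It assumes each node runs it over the undirected, distance-weighted graph of its own 2-hop neighborhood. The function name and the edge dictionary format are hypothetical.

```python
import math


def floyd_warshall(nodes, edges):
    """All-pairs shortest path lengths on an undirected weighted graph.

    edges maps (u, v) pairs to measured distances.
    """
    dist = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    for (u, v), w in edges.items():
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Allow each node k in turn to act as an intermediate hop.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


nodes = ["A", "B", "C", "D"]
edges = {("A", "B"): 3.0, ("B", "C"): 4.0, ("A", "C"): 10.0, ("C", "D"): 2.0}
d = floyd_warshall(nodes, edges)
print(d["A"]["C"], d["A"]["D"])  # 7.0 9.0
```

The cubic cost is acceptable here because each node only processes its 2-hop neighborhood rather than the whole network.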
Simulation Results: Cluster Consistency
The TASC outcome remains consistent when the average network connectivity varies between 10 and 35. This was expected, because the connectivity was varied by varying the maximum measurement range while the value of Dr was kept constant (Dr = 4). As a consequence, the density reachable subsets did not change, which keeps the clustering outcome at the same level.
[Figure: average number of clusters and average number of nodes per cluster (y-axis, 0 to 20) versus average number of neighbors per node (x-axis, 10 to 35)]
Cluster Spatial Uniformity
A comparison between the node density variation of the underlying network and the node density variation in its clusters shows a clear improvement in the degree of spatial uniformity. This verifies that TASC is able to cluster a globally non-uniform network into the locally more uniform node configurations that exist in the network. The result is computed from 6697 clusters.
[Figure: std of Delaunay triangle edge lengths (y-axis, 0 to 1) versus scenario # (x-axis, 0 to 100), for clusters and for the whole network; network avg = 0.688, cluster avg = 0.521]
Cluster Size and Cluster Density
Since a non-uniform network includes large density variations
and TASC groups nearby nodes together, the cluster size in
terms of number of nodes and in terms of cluster area is
inversely proportional to the cluster density. The existence of
that trend was verified by the simulation results.
[Figure: density per cluster (nodes/m2, y-axis, 0 to 3 x 10^-3) versus nodes in cluster (x-axis, 4 to 25)]
The Effect of Density Reachability
If the value of Dr increases, the density range begins to approach the maximum measurement range and the set of density reachable nodes approaches the entire 2-hop neighborhood of the node. As a consequence, cluster size increases and the resolution at which TASC clusters the network with respect to local uniformity becomes coarser.
The effect of density reachability when two different Dr values were applied to the same suite of 100 scenarios:
[Figure: two panels, Dr = 4 and Dr = 6, each showing density per cluster (nodes/m2, y-axis, 0 to 0.012) versus nodes in cluster (x-axis, 0 to 50)]
Noise Tolerance
The distance measurement noise was modeled as additive white Gaussian noise whose standard deviation was specified as a percentage of the measured distance.
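The noise model can be illustrated with a minimal Python sketch. It assumes the standard deviation is taken as a percentage of the true distance and that noisy values are not clamped. The function name `noisy_distance` and its parameters are hypothetical.

```python
import random
from statistics import mean, pstdev


def noisy_distance(true_distance, std_percent, rng=random):
    """Add zero-mean Gaussian noise with std = std_percent % of the distance.

    ASSUMPTION: no clamping, so very noisy samples may come out negative.
    """
    sigma = true_distance * std_percent / 100.0
    return true_distance + rng.gauss(0.0, sigma)


# 30 % noise on a 100-unit distance, using a seeded generator.
rng = random.Random(1)
samples = [noisy_distance(100.0, 30.0, rng) for _ in range(20000)]
print(round(mean(samples), 1), round(pstdev(samples), 1))
```

Across many samples the mean stays close to the true distance and the spread stays close to 30 units, which matches the 30% setting.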
TASC is able to obtain consistent cluster sizes up to a noise level where the standard deviation of the additive noise is 30% of the measured distance:
[Figure: average number of nodes per cluster (y-axis, 0 to 40) versus additive white Gaussian noise std as % of measured distance (x-axis, 0 to 60)]
[Figure: average std of Delaunay triangle edge lengths (y-axis, 0.4 to 0.8), cluster avg and network avg, versus additive white Gaussian noise std as % of measured distance (x-axis, 0 to 60)]
Clustering Outcome Example
Conclusions & Future Work



Locally distributed implementation: each node only
needs to be aware of its 2-hop neighborhood
The novel use of weights and density reachability
criteria
Simulations indicate that:





TASC can decompose large non-uniform networks into
smaller locally uniform clusters
TASC tolerates distance measurement noise up to a level where the standard deviation of the Gaussian noise is 30% of the measured distance
Evaluate and optimize communication overhead
The parameters of TASC should be adapted to fit the
particular application needs
It is possible to repeat the weight-based election process to
construct hierarchies