Node Clustering in Wireless Sensor
Networks by Considering Structural
Characteristics of the Network Graph
Nikos Dimokas1
Dimitrios Katsaros1,2
Yannis Manolopoulos1
1Informatics Dept., Aristotle University, Thessaloniki, Greece
2Computer & Comm. Engineering Dept., University of Thessaly, Volos, Greece
4th ITNG Conference, Las Vegas, NV, 2-4/April/2007
1
Wireless Sensor Network (WSN)
Wireless Sensor Network features
• Homogeneous devices
• Stationary nodes
• Dispersed Network
• Large Network size
• Self-organized
• All nodes act as routers
• No wired infrastructure
• Potential multihop routes
2
Communication in WSN
• Communication between two unconnected nodes is achieved
through intermediate nodes.
• Every node that falls inside the communication range r of a
node u is considered reachable from u.
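The reachability rule above can be sketched in a few lines. This is only an illustration: the node names, coordinates and the choice to treat a distance exactly equal to r as reachable are assumptions, not part of the slides.

```python
import math

def reachable_neighbors(positions, r):
    """Map each node to the set of nodes within communication range r.

    Assumption: a node at distance exactly r counts as reachable.
    """
    neighbors = {u: set() for u in positions}
    nodes = list(positions)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if math.dist(positions[u], positions[v]) <= r:
                neighbors[u].add(v)
                neighbors[v].add(u)
    return neighbors

# Hypothetical layout: a and c are out of range of each other,
# so they can only communicate through the intermediate node b.
positions = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (6.0, 8.0)}
links = reachable_neighbors(positions, r=5.0)
```

Here `a` and `c` are unconnected, and `b` is the intermediate node that makes a multihop route possible.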
3
WSN - Applications
• Applications
• Habitat monitoring
• Disaster relief
• Target tracking
• Many of these applications require simple and/or
aggregate functions to be reported.
• Clustering allows aggregation and limits data
transmissions.
4
What is Clustering
[Figure: a clustered network. Legend: cluster member, clusterhead, gateway node, intra-cluster link, cross-cluster link]
• Nodes are divided into virtual groups according to some rules
• Nodes belonging to a group can execute different functions
from other nodes.
5
Clustering in WSN
• Involves grouping nodes into clusters and electing a clusterhead (CH) for each cluster
• Members of a cluster can communicate with their CH directly
• A CH can forward the aggregated data to the central base station
through other CHs
• Clustering Objectives
• Allows aggregation
• Limits data transmission
• Facilitates the reuse of resources
• CHs and gateway nodes can form a virtual backbone for
intercluster routing
• Cluster structure gives the impression of a smaller and more
stable network
• Improves network lifetime
• Reduces network traffic and the contention for the channel
• Data aggregation and updates take place in CHs
6
Relevant work – Clustering
• Based on the construction of a Dominating Set (DS)
• Nodes belonging to the DS carry out all communication
• These nodes run out of energy very soon
• Based on the residual energy of each node
• Proposed ways to rotate the role of CH among the nodes of a cluster
• Can be easily combined with the algorithms of the first family
• Our proposal: the GESC protocol supports
• dynamic estimation of CHs depending on the requester node,
and thus improvement of network lifetime
• a novel metric for characterizing node importance
• localization
• a minimum number of messages exchanged among the nodes
7
Relevant work – Topology Control
Minimum Spanning Tree (MST) and Localized Minimum Spanning Tree
(LMST): Calculated with Dijkstra's algorithm and the algorithm of
Li, Hou & Sha, respectively.
[Figure: a sample graph with its MST and LMST]
Relative Neighborhood Graph (RNG): An edge uv is included in RNG iff
it is not the longest edge in any triangle uvw.
[Figure: two triangles uvw, one with uv included and one with uv not included]
Gabriel Graph (GG): An edge uv is included in GG iff the disk with
diameter uv contains no other node inside it.
[Figure: two disks with diameter uv, one with uv included and one with uv not included]
Delaunay Triangulation (DT), Partial Delaunay Triangulation (PDT), Yao graph (YG),
etc.: Many other (variants of) geometric structures
Topology Control: Choosing a set of links from the possible ones.
Not exactly our problem, so we rely on graph-theoretic concepts rather than geometric ones.
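The RNG and GG inclusion rules can be checked directly from node coordinates. The sketch below is illustrative only: the function names and sample points are hypothetical, ties are treated as "not longer" and "not inside", and squared distances are used so that integer coordinates avoid floating-point error.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def in_rng(u, v, others):
    """uv is kept unless some w makes uv strictly the longest edge of triangle uvw."""
    d_uv = sq_dist(u, v)
    return not any(max(sq_dist(u, w), sq_dist(v, w)) < d_uv for w in others)

def in_gabriel(u, v, others):
    """uv is kept unless some w lies strictly inside the disk with diameter uv."""
    d_uv = sq_dist(u, v)
    return not any(sq_dist(u, w) + sq_dist(v, w) < d_uv for w in others)

u, v = (0, 0), (4, 0)
far_w, middle_w, near_w = (2, 5), (2, 3), (2, 1)
```

With `far_w` the edge uv survives in both graphs, with `near_w` in neither, and with `middle_w` it survives in GG but not in RNG, which illustrates that RNG is the sparser structure.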
8
Minimal Dominating Set
• A vertex set is a DS (Dominating Set) if
• any other vertex is connected to at least one DS vertex
• It is a CDS (Connected DS) if, in addition, it is connected
• It is an MCDS (Minimum CDS) if its size is minimum among all CDSs
• Discovery of the MCDS of a graph is NP-complete
[Figure: an example DS and an example CDS]
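Verifying that a given vertex set is a DS or a CDS is cheap, even though finding a minimum one is hard. A minimal sketch, assuming the graph is given as a dictionary of neighbor sets (the representation and function names are illustrative):

```python
from collections import deque

def is_dominating_set(adj, ds):
    """Every vertex is in ds or has at least one neighbor in ds."""
    return all(v in ds or bool(adj[v] & ds) for v in adj)

def is_connected_dominating_set(adj, ds):
    """ds dominates the graph and induces a connected subgraph."""
    if not ds or not is_dominating_set(adj, ds):
        return False
    start = next(iter(ds))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u] & ds:       # stay inside the candidate set
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == ds

# Path graph 1 - 2 - 3 - 4 - 5
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
```

On this path, `{2, 4}` is a DS but not a CDS, while `{2, 3, 4}` is a CDS.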
9
Motivation for new clustering protocol
• The protocol should:
• be localized, and thus distributed
• fully exploit the locally available information in making the
best decisions
• be computationally efficient
• minimize the number of messages exchanged among the nodes
• be energy efficient and thus extend network lifetime.
This could be achieved with the use of different nodes for
relaying messages
• not make use of “variants”, e.g., node IDs, because a (locally)
best decision might not be reached (even if it does exist)
10
Well-known CDS algorithm
Wu and Li’s algorithm
• Each node exchanges its neighborhood information
with all of its one-hop neighbors
• Any node with two unconnected neighbors
becomes a dominator (red)
• The set of all the red nodes forms a CDS
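The marking step described above needs only each node's neighbor list and its neighbors' neighbor lists. A minimal sketch, assuming an adjacency dictionary of neighbor sets (the function name is illustrative):

```python
from itertools import combinations

def wu_li_marking(adj):
    """Mark every node that has two neighbors which are not directly connected."""
    marked = set()
    for v, nbrs in adj.items():
        if any(b not in adj[a] for a, b in combinations(nbrs, 2)):
            marked.add(v)
    return marked

# Triangle 1-2-3 with a tail node 4 attached to node 3
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
dominators = wu_li_marking(graph)
```

Only node 3 is marked here, because its neighbors 1 and 4 (and 2 and 4) cannot reach each other directly.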
11
Well-known CDS algorithm
Wu and Li’s algorithm (Pruning Rules 1 & 2)
Open neighbor set N(v) = {u | u is a neighbor of v}
Closed neighbor set N[v] = N(v) ∪ {v}
Rule 1: A node v can be taken out from the CDS if there exists a
node u such that N[v] is a subset of N[u] and the ID of v is smaller
than the ID of u.
[Figure: node v whose closed neighborhood is covered by that of u]
Rule 2: A node u can be taken out from the CDS if u has two neighbors
v and w such that N(u) is covered by N(v) ∪ N(w) and the ID of u is
smaller than the IDs of both v and w.
[Figure: node u whose open neighborhood is covered by those of v and w]
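The two pruning rules can be sketched as set operations on the neighbor sets. The code below is an illustration under stated assumptions: node IDs are comparable integers, the covering nodes are themselves required to be marked (the slide does not say so explicitly), and each rule is applied once to the marked set rather than iteratively.

```python
from itertools import combinations

def prune_rule1(adj, marked):
    """Drop v if a marked u with a larger ID satisfies N[v] <= N[u]."""
    closed = {v: adj[v] | {v} for v in adj}
    return {v for v in marked
            if not any(v < u and closed[v] <= closed[u] for u in marked)}

def prune_rule2(adj, marked):
    """Drop u if two marked neighbors v, w with larger IDs cover N(u)."""
    kept = set()
    for u in marked:
        cands = [x for x in adj[u] if x in marked and x > u]
        covered = any(adj[u] <= (adj[v] | adj[w])
                      for v, w in combinations(cands, 2))
        if not covered:
            kept.add(u)
    return kept

# Rule 1 example: N[1] = {1,2,3,4} is a subset of N[2] = {1,2,3,4,5}
g1 = {1: {2, 3, 4}, 2: {1, 3, 4, 5}, 3: {1, 2}, 4: {1, 2}, 5: {2}}
after_rule1 = prune_rule1(g1, {1, 2})

# Rule 2 example: N(1) = {2,3,4,5} is covered by N(2) | N(3)
g2 = {1: {2, 3, 4, 5}, 2: {1, 3, 4, 6}, 3: {1, 2, 5, 7},
      4: {1, 2}, 5: {1, 3}, 6: {2}, 7: {3}}
after_rule2 = prune_rule2(g2, {1, 2, 3})
```

In both examples node 1 is pruned only because it has the smallest ID, which is exactly the ID-based ordering that a later slide lists as a weakness.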
12
HEED protocol (1/2)
• Every sensor node has multiple power levels.
• Periodically selects CHs according to a hybrid of the node
residual energy and node degree.
• T_CP is the clustering process duration and T_NO is the network
operation interval.
• Clustering is activated every T_CP + T_NO seconds.
• The initial percentage of CHs among all nodes is C_prob.
• The probability of a node to become a CH is CH_prob:
$$CH_{prob} = C_{prob} \cdot \frac{E_{residual}}{E_{max}}$$
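As a quick numeric illustration of the CH probability, the sketch below scales the initial value C_prob by the fraction of energy a node has left. The function name, the input validation and the sample values are assumptions made for the example.

```python
def ch_prob(c_prob, e_residual, e_max):
    """CH_prob = C_prob * E_residual / E_max."""
    if e_max <= 0 or not 0 <= e_residual <= e_max:
        raise ValueError("expected 0 <= e_residual <= e_max and e_max > 0")
    return c_prob * e_residual / e_max

# Hypothetical values: 5% initial CHs, node at half of its maximum energy
p_half = ch_prob(0.05, 1.0, 2.0)
p_full = ch_prob(0.05, 2.0, 2.0)
```

A node with half of its energy left is half as likely to volunteer as a fully charged one, which is how the CH role drifts toward energy-rich nodes.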
13
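HEED protocol (2/2)
• Intracluster – Intercluster communication
• Intracluster communication cost is proportional to:
• Node degree (load distribution)
• 1 / node degree (dense clusters)
• If variable power levels are allowed for intracluster
communication then select CHs using the average minimum
reachability power (AMRP):
$$AMRP = \frac{\sum_{i=1}^{M} MinPwr_i}{M}$$
where M is the number of nodes within the cluster range and MinPwr_i
is the minimum power level node i needs to communicate with the CH.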
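The AMRP cost is a plain average, so comparing candidate CHs is straightforward. In this sketch the candidate names and power levels are made up, and picking the candidate with the lowest AMRP is one plausible reading of "select CHs using AMRP".

```python
def amrp(min_powers):
    """Average of the minimum power levels the M in-range nodes need to reach a CH."""
    if not min_powers:
        raise ValueError("at least one node is required")
    return sum(min_powers) / len(min_powers)

# Hypothetical candidates with the MinPwr_i values of the nodes in their range
candidates = {"ch_a": [2, 4, 6], "ch_b": [1, 3, 2]}
best = min(candidates, key=lambda ch: amrp(candidates[ch]))
```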
14
LEACH protocol (1/2)
• All nodes can transmit with enough power to reach the BS
and the nodes use power control.
• Cluster formation during set-up phase and data transfer
during steady-state phase.
• Each node i elects itself as CH at the beginning of round r+1
with probability P_i(t), chosen so that the expected number of CHs
equals k, the number of clusters (N is the number of nodes):
$$\sum_{i=1}^{N} P_i(t) \cdot 1 = k$$
• All nodes are CHs the same number of times.
• All nodes have the same energy after N/k rounds.
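One choice of P_i(t) that satisfies the constraint above is the rule given in the LEACH paper, which is not shown on the slide: nodes that already served as CH in the current cycle of N/k rounds get probability 0, and the rest share the k slots evenly. A sketch, assuming N is divisible by k:

```python
def leach_ch_probability(n, k, r, was_ch_this_cycle):
    """P_i(t) = k / (N - k * (r mod N/k)) for eligible nodes, else 0.

    Assumes n is a multiple of k; was_ch_this_cycle plays the role of
    the indicator C_i(t) = 0 in the LEACH paper.
    """
    if was_ch_this_cycle:
        return 0.0
    rounds_per_cycle = n // k
    return k / (n - k * (r % rounds_per_cycle))

# After 3 rounds with 5 CHs each, 85 of 100 nodes are still eligible
p_round0 = leach_ch_probability(100, 5, 0, False)
p_round3 = leach_ch_probability(100, 5, 3, False)
expected_chs = 85 * p_round3
```

The expected number of CHs stays at k in every round, and in the last round of a cycle the remaining eligible nodes elect themselves with probability 1.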
15
LEACH protocol (2/2)
• Every node selects as its CH the node that requires the least
energy consumption for communication.
• Every CH sets up a TDMA schedule and transmits it to the
nodes. Every node can transmit data in its corresponding
time slot.
Weakness
• Limited scalability
• Could be complementary to clustering
techniques based on the construction of a DS
16
Weakness of current approaches
• Some approaches cannot detect all possible eliminations
because ordering based on node ID prevents this. As a
consequence they incur significantly excessive retransmissions
• Others rely on a lot of "local" information, for instance
knowledge of k-hop neighborhood (k > 2), e.g., [WD04, WL04]
• Other methods are computationally expensive, incurring a
cost of O(f^2) or O(f^3), where f is the maximum degree of a node
of the ad hoc network, e.g., the methods reported in [WL01,
WD03, DW04] and [SSZ02]
• Some methods (e.g., [QVL00, SSZ02]) do not fully exploit the
compiled information; for instance, the use of the degree of a
node as its priority when deciding its possible inclusion in the
dominating set might not result in the best local decision
17
Terminology and assumptions
• WSN is abstracted as a graph G(V,E)
• An edge e=(u,v) exists if and only if u is in the transmission
range of v and vice versa. All links in the graph are
bidirectional.
• The network is assumed to be connected
• N1(v) : the set of one hop neighbours of v
• N2(v) : the set of two hop neighbours of v
• N12(v) : combined set of N1(v) and N2(v)
• LNv : the induced subgraph of G associated with the vertices in N12(v)
• dG(v,u) : distance between v and u
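The neighborhood sets defined above can be built from an adjacency dictionary as follows. This is a sketch: the function name is illustrative, and the induced subgraph follows the slide literally in that it contains the vertices of N12(v) but not v itself.

```python
def two_hop_view(adj, v):
    """Return N1(v), N2(v), N12(v) and the induced subgraph LNv."""
    n1 = set(adj[v])
    n2 = set()
    for u in n1:
        n2 |= adj[u]
    n2 -= n1 | {v}                  # keep only nodes exactly two hops away
    n12 = n1 | n2
    ln_v = {u: adj[u] & n12 for u in n12}   # edges with both ends in N12(v)
    return n1, n2, n12, ln_v

# Path graph 1 - 2 - 3 - 4 - 5
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
n1, n2, n12, ln_v = two_hop_view(path, 1)
```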
18
A new measure of node importance
• Let σ_uw = σ_wu denote the number of shortest paths from u ∈ V to
w ∈ V (by definition, σ_uu = 0).
• Let σ_uw(v) denote the number of shortest paths from u to w
that some vertex v ∈ V lies on.
• We define the node importance index NI(v) of a vertex v as:
$$NI(v) = \sum_{u \neq v \neq w \in V} \frac{\sigma_{uw}(v)}{\sigma_{uw}}$$
• Large values for the NI index of a node v indicate that this
node can reach others on relatively short paths, or that v lies
on considerable fractions of shortest paths connecting others.
In the former case, it captures the fact of a possibly large
degree of node v, and in the latter case, it captures the fact
that v might have one or more "isolated" neighbors
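A small worked example may help. It assumes that NI(v) sums the fraction σ_uw(v)/σ_uw over ordered pairs (u, w) of distinct vertices other than v; counting unordered pairs instead would halve every value. The two graphs, a three-node path and a four-node cycle, are chosen only for illustration.

```latex
\begin{align*}
\text{Path } a - b - c:\quad
  & NI(b) = \frac{\sigma_{ac}(b)}{\sigma_{ac}} + \frac{\sigma_{ca}(b)}{\sigma_{ca}}
          = \frac{1}{1} + \frac{1}{1} = 2,
  \qquad NI(a) = NI(c) = 0 \\
\text{Cycle } a - b - c - d - a:\quad
  & NI(b) = \frac{\sigma_{ac}(b)}{\sigma_{ac}} + \frac{\sigma_{ca}(b)}{\sigma_{ca}}
          = \frac{1}{2} + \frac{1}{2} = 1
\end{align*}
```

In the path, b is an articulation node and carries every shortest path between a and c. In the cycle, a and c have two shortest paths and b lies on only one of them, so its index drops.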
19
The NI index in sample graphs
[Figure: sample graphs annotated with NI values]
In parentheses, the NI index of the respective node; i.e.,
7(156): node with ID 7 has NI equal to 156.
Nodes with large NI:
• Articulation nodes (in bridges), e.g., 3, 4, 7, 16, 18
• With large fanout, e.g., 14, 8, U
Therefore: geodesic nodes
20
The NI index in a localized algorithm
• For any node v, the NI indexes of the nodes in N12(v)
calculated only for the subgraph of the 2-hop (in general,
k-hop) neighborhood reveal the relative importance of the
nodes in covering N12(v)
• For a node u (of the 2-hop neighbourhood of a node v), the
NI index of u will be denoted as NIv(u)
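A localized computation of NIv(u) could look like the sketch below. It is a direct, unoptimized translation of the definition, not the authors' CalculateNodeImportanceIndex, and it rests on assumptions: NI sums σ_uw(x)/σ_uw over ordered pairs, and the local subgraph is induced on N12(v) without v, as the terminology slide states.

```python
from collections import deque

def bfs_counts(adj, s):
    """Hop distances from s and number of shortest paths from s to each node."""
    dist, sigma = {s: 0}, {s: 1}    # sigma[s] = 1 is a bookkeeping convention
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def node_importance(adj):
    """NI(x) = sum over ordered pairs (u, w) of sigma_uw(x) / sigma_uw."""
    info = {s: bfs_counts(adj, s) for s in adj}
    ni = dict.fromkeys(adj, 0.0)
    for u in adj:
        dist_u, sigma_u = info[u]
        for w in dist_u:
            if w == u:
                continue
            for x in dist_u:
                if x == u or x == w:
                    continue
                dist_x, sigma_x = info[x]
                # x is on a shortest u-w path iff the distances add up
                if w in dist_x and dist_u[x] + dist_x[w] == dist_u[w]:
                    ni[x] += sigma_u[x] * sigma_x[w] / sigma_u[w]
    return ni

def local_node_importance(adj, v):
    """NIv(u) for every u in N12(v), computed on the induced subgraph only."""
    n12 = set(adj[v])
    for u in adj[v]:
        n12 |= adj[u]
    n12.discard(v)
    local = {u: adj[u] & n12 for u in n12}
    return node_importance(local)

graph = {1: {2}, 2: {1, 3, 4}, 3: {2, 5}, 4: {2, 5}, 5: {3, 4}}
ni_seen_by_1 = local_node_importance(graph, 1)
```

From node 1's point of view, node 2 is the only one that connects the rest of the 2-hop neighborhood, so it gets the highest local index.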
21
NI computation
• At first glance, NI computation seems expensive, i.e.,
O(m*n^2) operations in total for a 2-hop neighbourhood,
which consists of n nodes and m links:
• calculating the shortest path between a particular pair of vertices
(assume for the moment that there exists only one) can be done
using BFS in O(m) time, and there exist O(n^2) vertex pairs
• Fortunately, we can do better than this by making some
smart observations. The improved algorithm
(CalculateNodeImportanceIndex) is quite complicated and
beyond the scope of this presentation
• THEOREM. The complexity of the algorithm
CalculateNodeImportanceIndex is O(n*m) for a graph
with n vertices and m edges
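The slides do not show CalculateNodeImportanceIndex itself. For orientation only, Brandes' well-known betweenness algorithm reaches the same O(n*m) bound on unweighted graphs by running one BFS per source and accumulating path fractions backwards. The sketch below implements that technique, assuming NI coincides with betweenness summed over ordered pairs; it is not claimed to be the authors' algorithm.

```python
from collections import deque

def brandes_node_importance(adj):
    """Betweenness over ordered pairs via Brandes' dependency accumulation."""
    ni = dict.fromkeys(adj, 0.0)
    for s in adj:
        order = []                              # nodes in non-decreasing distance
        preds = {v: [] for v in adj}            # shortest-path predecessors
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:                            # BFS phase: O(m)
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while order:                            # accumulation phase: O(m)
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                ni[w] += delta[w]
    return ni

star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
ni_star = brandes_node_importance(star)
ni_cycle = brandes_node_importance(cycle)
```

Each source costs one BFS plus one backward pass over the predecessor lists, which gives O(n*m) in total for a connected graph.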
22
Pseudocode for
CalculateNodeImportanceIndex (1/2)
23
Pseudocode for
CalculateNodeImportanceIndex (2/2)
24
Evaluation setting (1/2)
• We compare GESC to:
• WL 1+2, Wu and Li's algorithm improved with Pruning Rules 1 and 2
• MPR, the MultiPoint Relaying method described in [QVL00]
• SSZ, reported in [SSZ02], which was selected as a Fast
Breaking Paper for October 2003
• Protocols implemented using the J-Sim simulation
library
• Sensor network topologies with 100, 300, 500 nodes.
• Each topology consists of square grid units
• Sensor nodes are placed uniformly at random between the points
(0,0) and (100,100)
• Two sensor nodes are neighbors if they are placed in the same
or adjacent grid units.
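A topology of this kind could be generated as sketched below. Several details are assumptions because the slides do not state them: the grid unit size (`cell`), whether diagonally touching units count as adjacent (they do here), and the fixed random seed used for reproducibility.

```python
import random

def cells_adjacent(c1, c2):
    """Same grid unit or one of its eight surrounding units (assumed)."""
    return abs(c1[0] - c2[0]) <= 1 and abs(c1[1] - c2[1]) <= 1

def grid_topology(num_nodes, side=100.0, cell=10.0, seed=0):
    """Place nodes uniformly in [0, side]^2 and link nodes in adjacent grid units."""
    rng = random.Random(seed)
    positions = [(rng.uniform(0, side), rng.uniform(0, side))
                 for _ in range(num_nodes)]
    cells = [(int(x // cell), int(y // cell)) for x, y in positions]
    adj = {i: set() for i in range(num_nodes)}
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            if cells_adjacent(cells[i], cells[j]):
                adj[i].add(j)
                adj[j].add(i)
    return positions, adj

positions, adj = grid_topology(100)
avg_degree = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
```

Changing the hypothetical `cell` parameter changes how many nodes fall into neighboring units, so it is one way such a setup could be tuned toward a target average degree.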
25
Evaluation setting (2/2)
• Varying levels of node degree from 4 to 10
• Run each protocol at least 100 times for each different
node degree. Each time a different node is selected to
start broadcasting
• Performance metrics
• Energy dissipation
• Broadcast messages
• Latency
26
Impact of the #nodes (1/2)
27
Impact of the #nodes (2/2)
28
Impact of the average node degree
29
Impact of energy consumption
30
Conclusions and Future Work
• Defined and investigated a novel distributed clustering protocol
for WSN based on a novel localized metric
• The calculation of this metric is very efficient, linear in the
number of nodes and linear in the number of links
• Proved that it is very efficient in terms of communication cost
and in terms of prolonging network lifetime
• The protocol is able to reap significant performance gains,
reducing the number of rebroadcasting nodes
• Simulated an environment to evaluate the performance of the
protocol and of competing protocols using the J-Sim simulator
• Comparison with protocols based on residual energy
(LEACH, HEED)
• GESC – GEodesic Sensor Clustering has been proven to prevail
31