Clustering in Sensor Networks

Why Clustering?
– The data collected by each sensor is communicated through the network to a single processing center that uses the data
– Clustering organizes sensors into groups so that each sensor communicates only with its clusterhead, and the clusterheads communicate the aggregated information to the processing center, saving energy and bandwidth
– Transmitting a bit costs more energy than a computation; therefore, it may be beneficial to organize the sensors into clusters
– Cluster-based control structures provide more efficient use of resources in wireless sensor networks

Clustering can be used for
– Transmission management
– Backbone formation
– Routing efficiency

An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks [Bandyopadhyay+ 2003]
– This paper proposes a distributed, randomized clustering algorithm that organizes the sensors in a wireless sensor network into clusters so as to minimize the energy used to communicate information from all nodes to the processing center
– The algorithm can generate a hierarchy of clusterheads; the energy savings increase with the number of levels in the hierarchy
– Sensors detect events and then communicate the collected information to a central location where parameters characterizing these events are estimated
– In the clustered environment, the data gathered by the sensors is communicated to the data processing center through a hierarchy of clusterheads
– The processing center determines the final estimates of the parameters using information communicated by the clusterheads
– The processing center can be a specialized device or one of the sensors itself
– Because sensor data in such a clustered environment is communicated over smaller distances, the energy consumed in the network will be much lower than the energy consumption when every sensor communicates directly to the
information processing center
– Results from stochastic geometry are used to derive values of the algorithm's parameters that minimize the energy spent in the sensor network

A New, Energy-Efficient, Single-Level Clustering Algorithm
– Each sensor becomes a clusterhead (CH) with probability p and advertises itself as a clusterhead to the sensors within its radio range; these clusterheads are called volunteer clusterheads
– This advertisement is forwarded to all the sensors that are no more than k hops away from the clusterhead
– Any sensor that receives such advertisements and is not itself a clusterhead joins the cluster of the closest clusterhead
– Any sensor that is neither a clusterhead nor a member of any cluster becomes a clusterhead itself; these are called forced clusterheads
– Since advertisement forwarding is limited to k hops, a sensor that does not receive a CH advertisement within time duration t (where t is the time required for data to reach the CH from any sensor k hops away) can infer that it is not within k hops of any volunteer CH
– Therefore, the sensor becomes a forced clusterhead
– The CH can transmit the aggregated information to the processing center after every t units of time, since all the sensors within a cluster are at most k hops away from the CH
– The limit on the number of hops allows the CHs to schedule their transmissions
– This is a distributed algorithm and does not demand clock synchronization between the sensors
– The energy consumed for the information gathered by the sensors to reach the processing center depends on the parameters p and k
– Since the objective of this work is to organize sensors in clusters to minimize the energy
consumption, values of the parameters p and k that achieve this goal must be found

Assumptions made in deriving the optimal parameters:
– The sensors are distributed as per a homogeneous spatial Poisson process of intensity λ in 2-dimensional space
– All sensors transmit at the same power level and hence have the same radio range r
– Data exchanged between two communicating sensors not within each other's radio range is forwarded by other sensors
– A distance of d between any sensor and its CH is equivalent to d/r hops
– Each sensor uses 1 unit of energy to transmit or receive 1 unit of data
– A routing infrastructure is in place; when a sensor communicates data to another sensor, only the sensors on the routing path forward the data
– The communication environment is contention- and error-free; sensors do not have to retransmit any data

A New, Energy-Efficient, Hierarchical Clustering Algorithm
– This algorithm extends the previous one by allowing more than one level of clustering
– Assume that there are h levels in the clustering hierarchy, with level 1 being the lowest level and level h being the highest
– The sensors communicate the gathered data to level-1 clusterheads (CHs)
– The level-1 CHs aggregate this data and communicate the aggregated data to level-2 CHs, and so on
– Finally, level-h CHs communicate the aggregated data, or estimates based on this aggregated data, to the processing center
– The cost of communicating the information from the sensors to the processing center is the energy consumed by the sensors to
communicate the information to level-1 CHs, plus the energy consumed by the level-1 CHs to communicate the aggregated data to level-2 CHs, …, plus the energy consumed by the level-h CHs to communicate the aggregated data to the information processing center

Algorithm Details
– The algorithm works in a bottom-up fashion
– First, it elects the level-1 clusterheads, then the level-2 clusterheads, and so on
– Level-1 clusterheads are chosen as follows:
  o Each sensor decides to become a level-1 CH with a certain probability p1 and advertises itself as a clusterhead to the sensors within its radio range
  o This advertisement is forwarded to all the sensors within k1 hops of the advertising CH
  o Each sensor receiving an advertisement joins the cluster of the closest level-1 CH; the remaining sensors become forced level-1 CHs
– Level-1 CHs then elect themselves as level-2 CHs with a certain probability p2 and broadcast their decision of becoming a level-2 CH
– This decision is forwarded to all the sensors within k2 hops
– The level-1 CHs that receive the advertisement from level-2 CHs join the cluster of the closest level-2 CH; the remaining level-1 CHs become forced level-2 CHs
– Clusterheads at levels 3, 4, 5, …, h are chosen in similar fashion with probabilities p3, p4, p5, …, ph respectively to generate a hierarchy of CHs, in which any level-i CH is also a CH of levels (i-1), (i-2), …, 1.
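
The bottom-up election described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the network is given as an adjacency dictionary with comparable node ids, "closest" is approximated by hop count (ties go to whichever advertisement arrives first), and the time-out t is modeled implicitly, in that a candidate still unreached once flooding ends is treated as having timed out. The names `elect_level` and `build_hierarchy` are hypothetical.

```python
import random
from collections import deque


def elect_level(adjacency, candidates, p, k, rng):
    """Elect clusterheads (CHs) among `candidates` for one level.

    Returns a dict mapping every candidate to the CH it belongs to.
    Assumption: hop count stands in for distance to the CH.
    """
    # Volunteer step: each candidate independently becomes a CH with probability p.
    volunteers = [n for n in sorted(candidates) if rng.random() < p]

    # Flood the advertisements for at most k hops (multi-source BFS), so the
    # first advertisement to reach a node comes from a CH at minimum hop count.
    nearest = {v: (v, 0) for v in volunteers}  # node -> (CH, hops to CH)
    queue = deque(volunteers)
    while queue:
        node = queue.popleft()
        head, hops = nearest[node]
        if hops == k:
            continue  # the advertisement is not forwarded beyond k hops
        for neighbor in adjacency[node]:
            if neighbor not in nearest:
                nearest[neighbor] = (head, hops + 1)
                queue.append(neighbor)

    # Forced step: a candidate that heard no advertisement heads its own cluster.
    return {n: nearest[n][0] if n in nearest else n for n in candidates}


def build_hierarchy(adjacency, probs, hops, rng):
    """Run the election bottom-up; level i+1 candidates are the level-i CHs."""
    levels = []
    candidates = set(adjacency)
    for p, k in zip(probs, hops):
        head_of = elect_level(adjacency, candidates, p, k, rng)
        levels.append(head_of)
        candidates = set(head_of.values())
    return levels


if __name__ == "__main__":
    # Hypothetical topology: a chain of 8 sensors, each linked to its neighbors.
    chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
    levels = build_hierarchy(chain, probs=[0.4, 0.5], hops=[2, 3], rng=random.Random(1))
    for level, head_of in enumerate(levels, start=1):
        print(f"level {level}:", head_of)
```

Because the candidates at each level are exactly the clusterheads of the level below, every level-i CH in this sketch is also a CH at all lower levels, matching the hierarchy property stated above. Advertisements are still forwarded through all sensors, not only through the candidates.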

Advantages:
– It is considered one of the earliest clustering algorithms for sensor networks that incorporates energy efficiency into the design of the algorithm
– Since it is a distributed algorithm, there is no need for clock synchronization between sensor nodes
– It achieves not only better energy efficiency, but also better time complexity compared to previous work
– The sensor nodes considered are simple nodes with a fixed transmission power level
– Since the algorithm is run periodically, the probability of becoming a clusterhead in each period is chosen to ensure that every node gets a chance to become a clusterhead, which provides load balancing
– Another approach to ensure load balancing is to trigger the algorithm when the energy levels fall below a certain threshold
– For single-level clustering, energy savings increase as the density of the sensor nodes increases
– For the hierarchical clustering algorithm, the energy savings increase for (i) networks of sensors with lower communication radius, (ii) lower density of sensors in the network, and (iii) a larger number of hierarchy levels

Disadvantages:
– The energy consumption of clusterheads is not addressed, even though these nodes are involved in more computation and in communicating data to higher-level clusterheads; the consequence of this non-uniform power consumption for the long-run performance of the overall sensor network is left open
– An ideal network is assumed (contention- and error-free), which may not reflect real-life scenarios
– Possible load imbalance between different clusters
– Overhead associated with clusterhead selection is not considered
– How does the network cope with sensor node failures? How are failures detected and remedied?
– How does the network handle information sent by faulty sensors?
– How many forced clusterheads can the sensor network handle? What is the upper bound? What guarantees that a forced clusterhead will be able to communicate with the neighboring clusterheads?
– Similarly, what is the upper bound on the number of sensor nodes within one cluster?
– Energy is wasted by sensor nodes that are closer to the processing center than to their CH but still need to go through their CH

Suggestions/Improvements/Future Work:
– What happens if a sensor node receives advertisements from multiple nearby clusterheads? How does the sensor node decide which one to join?
  Possible solution: the node can join the cluster with the minimum number of members, so that sensor nodes are evenly distributed among the clusters
– Error and contention in communication are not considered
  Possible solution: results may be verified with a real MAC protocol and traffic conditions in a simulator or on a test-bed
– The processing center should have greater capabilities than the regular sensor nodes
– Further energy efficiency can be achieved if the clusterheads can be in active or inactive mode (energy saving mode)
– Depending on their distance from the clusterhead, the sensor nodes may choose to transmit data towards the clusterhead at various power levels (for instance, low vs.
high)
– In multi-hop mode, the sensor nodes closest to the clusterhead drain the most energy due to data forwarding
  Possible solution: a scheme allowing the sensor nodes to alternate between single-hop and multi-hop mode periodically

Energy-Efficient Communication Protocol Architecture for Wireless Microsensor Networks (LEACH Protocol) [Heinzelman+ 2000, 2002]
– LEACH (Low-Energy Adaptive Clustering Hierarchy) is a clustering-based protocol that utilizes randomized rotation of local cluster base stations to evenly distribute the energy load among the sensors in the network
– It is distributed: it does not require any control information from the base station (BS), and the nodes do not need knowledge of the global network for LEACH to function
– The energy saving of LEACH is achieved by combining compression with data routing
– Key features of LEACH include:
  o Localized coordination and control of cluster set-up and operation
  o Randomized rotation of the cluster base stations or clusterheads and their clusters
  o Local compression of information to reduce global communication
– The microsensor network considered has the following characteristics:
  o The base station is fixed and located far from the sensors
  o All the sensor nodes are homogeneous and energy constrained
– Communication between sensor nodes and the base station is expensive, and no high-energy nodes exist to carry out this communication
– By using clusters to transmit data to the BS, only a few nodes need to transmit over large distances to the BS, while the other nodes in each cluster transmit over small distances
– LEACH achieves superior performance compared to classical clustering algorithms by using adaptive clustering and rotating clusterheads, which spreads the total energy load of the system among all the nodes
– By performing local computation in each cluster, the amount of data to be transmitted to the BS is reduced.
Therefore, a large reduction in energy dissipation is achieved, since communication is more expensive than computation

Algorithm Overview
– The nodes are grouped into local clusters, with one node acting as the local base station, or clusterhead (CH)
– The CH role is rotated in random fashion among the various sensors
– Local data fusion compresses the data being sent from the clusters to the BS, reducing energy dissipation and increasing network lifetime
– Sensors elect themselves to be local base stations (CHs) at any given time with a certain probability, and these CHs broadcast their status to the other sensor nodes
– Each node decides which CH to join based on the minimum communication energy
– Upon cluster formation, each CH creates a schedule for the nodes in its cluster, so that the radio components of each non-clusterhead node can be turned off at all times except during its transmit time
– The CH aggregates all the data received from the nodes in its cluster before transmitting the compressed data to the BS
– The transmission between CH and BS is a high-energy transmission
– In order to evenly distribute energy usage among the sensor nodes, clusterheads are self-elected at different time intervals
– Each node decides whether to become a CH depending on the amount of energy it has left
– The decisions to become a CH are made independently of the other nodes
– The system can determine the optimal number of CHs prior to the election procedure based on parameters such as network topology and relative costs of computation vs.
communication (the optimal number of CHs considered is 5% of the nodes)
– It has been observed that nodes die in a random fashion
– No communication exists between CHs
– Each node has the same probability of becoming a CH

Algorithm Details
– The operation of LEACH is organized into rounds
– Each round begins with a set-up phase (clusters are formed) followed by a steady-state phase (data transmission to the BS occurs)

1. Advertisement Phase:
– Initially, each node needs to decide whether to become a CH for the current round, based on the suggested percentage of CHs for the network (set prior to this phase) and the number of times the node has acted as a CH so far
– The node n decides by choosing a random number between 0 and 1
– If this random number is less than a threshold T(n), the node becomes a CH for this round
– The threshold is set as follows:

  $$T(n) = \begin{cases} \dfrac{P}{1 - P \cdot \left(r \bmod \frac{1}{P}\right)} & \text{if } n \in G \\ 0 & \text{otherwise} \end{cases}$$

  where P = desired percentage of CHs, r = current round, and G = set of nodes that have not been CHs in the last 1/P rounds
– Assumptions are (i) each node starts with the same amount of energy and (ii) serving as a CH consumes approximately the same amount of energy for each node
– Each node elected as CH broadcasts an advertisement message to the rest of the nodes
– During this “clusterhead-advertisement” phase, the non-clusterhead nodes hear the advertisements of all CHs and decide which CH to join
– A node joins the CH whose advertisement it hears with the highest signal strength

2. Cluster Set-Up Phase:
– Each node informs its chosen clusterhead that it will be a member of the cluster

3. Schedule Creation:
– Upon receiving all the join messages from its members, the CH creates a TDMA schedule that assigns each member its allowed transmission time, based on the total number of members in the cluster

4. Data Transmission:
– Each node transmits data to its CH according to the TDMA schedule
– The radio of each cluster member node can be turned off until its allocated transmission time comes, minimizing energy dissipation
– The CH node must keep its receiver on to receive all the data
– Once all the data is received, the CH compresses the data and sends it to the BS

Multiple Clusters
– In order to minimize radio interference between nearby clusters, each CH chooses randomly from a list of CDMA spreading codes and informs its cluster members to transmit using this code
– The radio signals of neighboring clusters are then filtered out, which avoids corruption of the transmission

Advantages:
– Localized coordination enables scalability and robustness for dynamic networks
– Incorporates data fusion into the routing protocol in order to reduce the amount of information transmitted to the BS
– Distributes energy dissipation evenly throughout the sensors, thus increasing the system lifetime of the network

Disadvantages:
– How should the percentage of clusterheads for a network be decided? The topology, density and number of nodes can differ from one network to another
– No suggestions are given about when re-election needs to be invoked
– The clusterheads farther away from the base station will use higher power and die more quickly than the nearby ones

Suggestions/Improvements/Future Work:
– Extensions can be included to support hierarchical clustering, where each CH communicates with a “super-clusterhead” and so on up to the top layer of the hierarchy, at which point the data is sent to the BS
– The degree and remaining energy of a node may be considered as parameters to decide a clusterhead in a round.
If a clusterhead with limited power uses up its power during a round, the data to be transmitted may be lost
– Since a TDMA schedule is used, a large delay may be introduced between event detection and notification at the base station. Therefore, the protocol is not suitable for real-time applications

Related Work

Highest-Degree Heuristic [Gerla+ 1995, Parekh 1994]
– Computes the degree of a node based on the distance (transmission range) between the node and the other nodes
– The node with the maximum number of neighbors (maximum degree) is chosen to be a clusterhead, and any tie is broken by the node ids
Drawbacks:
– A clusterhead cannot handle a large number of nodes due to resource limitations
– The load handling capacity of the clusterhead puts an upper bound on the node degree
– The throughput of the system drops as the number of nodes in a cluster increases

Lowest-ID Heuristic [Baker+ 1981a, 1981b, Ephremides+ 1987]
– The node with the minimum node-id is chosen to be a clusterhead
– A node is called a gateway if it lies within the transmission range of two or more clusters
– A distributed gateway is a pair of nodes that reside within different clusters but are within the transmission range of each other
Drawbacks:
– It is biased towards nodes with smaller node-ids, leading to battery drainage of those nodes
– It does not attempt to balance the load across all the nodes

Node-Weight Heuristic [Basagni 1999a, 1999b]
– Node-weights are assigned to nodes based on the suitability of a node for being a clusterhead
– A node is chosen to be a clusterhead if its node-weight is higher than any of its neighbors' node-weights, and any tie is broken by the minimum node id
Drawbacks:
– No concrete criteria for assigning the node-weights
– Works well only for “quasi-static” networks where the nodes do not move much or move very slowly

Weighted Clustering Algorithm (WCA) [Chatterjee+ 2002]
– A clusterhead can ideally support δ nodes (a predefined threshold)
  o Ensures efficient MAC functioning
  o Minimizes delay and
maximizes throughput
– A clusterhead uses more battery power
  o Does extra work due to packet forwarding
  o Communicates with a larger number of nodes
– A clusterhead should be less mobile
  o Helps to maintain the same configuration
  o Avoids frequent WCA invocation
– Power usage is better with physically closer nodes
  o More power is needed for distant nodes due to signal attenuation

WCA Steps
1. Compute the degree d_v of each node v:

   $$d_v = |N(v)| = \sum_{v' \in V,\, v' \neq v} \left\{ \mathrm{dist}(v, v') < tx_{range} \right\}$$

   where dist is the coordinate distance and tx_range is the predefined transmission range.
2. Compute the degree-difference for every node v:

   $$\Delta_v = |d_v - \delta|$$

   where δ is the upper bound on the number of nodes a clusterhead can handle; this supports efficient MAC (medium access control) functioning.
3. Compute the sum of the distances D_v with all neighbors:

   $$D_v = \sum_{v' \in N(v)} \mathrm{dist}(v, v')$$

   This relates to energy consumption: more energy is needed for communication over greater distances. The power required to support a link increases faster than linearly with distance (for cellular networks).
4. Compute the average speed M_v of every node, which gives a measure of mobility:

   $$M_v = \frac{1}{T} \sum_{t=1}^{T} \sqrt{(X_t - X_{t-1})^2 + (Y_t - Y_{t-1})^2}$$

   where (X_t, Y_t) and (X_{t-1}, Y_{t-1}) are the coordinates of node v at times t and t-1. A node with less mobility is a better choice for clusterhead.
5. Compute the total (cumulative) time P_v a node acts as clusterhead; this reflects battery drainage (power consumed).
6. Calculate the combined weight W_v for each node:

   $$W_v = w_1 \Delta_v + w_2 D_v + w_3 M_v + w_4 P_v$$

7. Find the node v with the minimum W_v and choose it as the clusterhead; remove all neighbors of v from further consideration by WCA
8. Repeat steps 2 to 7 for the remaining nodes

Load Balancing Factor (LBF)
– It is desirable to balance the loads among the clusters
– The load balancing factor (LBF) is defined as follows (a higher value is better):

  $$\mathrm{LBF} = \frac{n_c}{\sum_i (x_i - \mu)^2}$$

  where n_c is the number of clusterheads, x_i is the cardinality of cluster i, and $\mu = \frac{N - n_c}{n_c}$ is the average number of neighbors of a clusterhead (N being the total number of nodes in the system)

Connectivity
– For clusters to communicate with each other, it is assumed that clusterheads are capable of operating in dual power mode
– A clusterhead uses low power mode to communicate with its immediate neighbors within its transmission range, and high power mode to communicate with neighboring clusters
– Connectivity is defined (for a multiple-component graph) as

  $$\text{connectivity} = \frac{\text{size of largest component}}{N}$$

  which is the probability that a node is reachable from any other node

References
[Baker+ 1981a] D.J. Baker and A. Ephremides, A Distributed Algorithm for Organizing Mobile Radio Telecommunication Networks, Proceedings of the 2nd International Conference on Distributed Computer Systems, April 1981, pp. 476-483.
[Baker+ 1981b] D.J. Baker and A. Ephremides, The Architectural Organization of a Mobile Radio Network via a Distributed Algorithm, IEEE Transactions on Communications COM-29(11), 1981, pp. 1694-1701.
[Bandyopadhyay+ 2003] S. Bandyopadhyay and E.J. Coyle, An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks, IEEE INFOCOM 2003, San Francisco, CA, March 30 – April 3, 2003.
[Basagni 1999a] S. Basagni, Distributed Clustering for Ad hoc Networks, Proceedings of International Symposium on Parallel Architectures, Algorithms and Networks, June 1999, pp. 310-315.
[Basagni 1999b] S.
Basagni, Distributive and Mobility-Adaptive Clustering for Multimedia Support in Multi-hop Wireless Networks, Proceedings of Vehicular Technology Conference, VTC, Vol. 2, 1999-Fall, pp. 889-893.
[Chatterjee+ 2002] M. Chatterjee, S.K. Das and D. Turgut, WCA: A Weighted Clustering Algorithm for Mobile Ad hoc Networks, Journal of Cluster Computing (Special Issue on Mobile Ad hoc Networks), Vol. 5, No. 2, April 2002, pp. 193-204.
[Ephremides+ 1987] A. Ephremides, J.E. Wieselthier and D.J. Baker, A Design Concept for Reliable Mobile Radio Networks with Frequency Hopping Signaling, Proceedings of the IEEE, Vol. 75(1), 1987, pp. 56-73.
[Gerla+ 1995] M. Gerla and J.T. Tsai, Multicluster, Mobile, Multimedia Radio Network, Wireless Networks, Vol. 1, No. 3, 1995, pp. 255-265.
[Heinzelman+ 2000] W. Heinzelman, A.P. Chandrakasan and H. Balakrishnan, Energy-Efficient Communication Protocol for Wireless Microsensor Networks, IEEE Proceedings of the Hawaii International Conference on System Sciences, January 4-7, 2000, Maui, Hawaii.
[Heinzelman+ 2002] W. Heinzelman, A.P. Chandrakasan and H. Balakrishnan, An Application-Specific Protocol Architecture for Wireless Microsensor Networks, IEEE Transactions on Wireless Communications, Vol. 1, No. 4, October 2002, pp. 660-670.
[Parekh 1994] A.K. Parekh, Selecting Routers in Ad-hoc Wireless Networks, Proceedings of the SBT/IEEE International Telecommunications Symposium, August 1994.