MULTIPLE SINK LOCATION PROBLEM AND ENERGY EFFICIENCY IN LARGE SCALE WIRELESS SENSOR NETWORKS

by
Eylem İlker Oyman
B.S. in Computer Engineering, Boğaziçi University, 1993
B.S. in Mathematics, Boğaziçi University, 1993
M.S. in Computer Engineering, Boğaziçi University, 1996

Submitted to the Institute for Graduate Studies in Science and Engineering in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Graduate Program in Computer Engineering
Boğaziçi University
2004

APPROVED BY:
Prof. Cem Ersoy (Thesis Supervisor)
Prof. M. Ufuk Çağlayan
Prof. Bülent Örencik
Assoc. Prof. Can Özturan
Assist. Prof. Murat Zeren

DATE OF APPROVAL: .................

ACKNOWLEDGEMENTS

It has been a long journey from the start of this work to its completion. Many valuable people have contributed to this thesis, not only academically but also emotionally.

First of all, I would like to thank the professors on my thesis jury, Prof. M. Ufuk Çağlayan, Prof. Bülent Örencik, Assoc. Prof. Can Özturan, and Assist. Prof. Murat Zeren, for their valuable comments and directions. Prof. Cem Ersoy, my thesis advisor, helped me focus on the work and showed me how to be academically productive.

The long-running PhD study had many bureaucratic obstacles. However, our dear secretary Sevgi Dikmen was always able to find a clean solution. Without her help, I could not have survived in the jungle. Many special thanks…

The people in the Netlab were always close, friendly and helpful. Their innovative ideas have raised the value of this work.
Especially Kaan Bür, my friend, roommate, neighbor, and travel-mate… I will never forget the taste of those repetitive lunches. In addition, Dr. Roy Küçükateş, my partner since the stone age, was really patient in the business and also helpful, especially in the early stages of my Opnet work.

Finally, I want to thank my family. I have always felt their blessing, support, encouragement and love with me. And my Esra, my dear wife… She is my light in the darkness, my oasis in the desert, my rescue island in the ocean. Since I found her, life has had meaning.

ABSTRACT

Energy is the most critical resource in the life of a wireless sensor node. Therefore, its usage must be optimized to maximize the network lifetime. Besides using power adjustable transmitter circuitry, multi-hop communication links should be considered to save energy. Moreover, in large-scale networks with a large number of sensor nodes, multiple sink nodes should be deployed, not only to increase the manageability of the network, but also to reduce the energy dissipation at each node.

In this thesis, we introduce problems related to locating multiple sink nodes in the sensor network area. We give a framework consisting of new formulations and definitions for multiple sink sensor networks. Then, we investigate the use of multi-hop communication links and compare the energy gain on alternative routes using analytical techniques. We show that employing multi-hop links does not always result in an energy gain, and quantify the situations in which it is advantageous. We also show that neglecting the overhead energy and overemphasizing the importance of power adjustable transmitter circuitry can result in considerable energy loss. The analytical results are validated using simulations on different scenarios.
Then, we focus on the multiple sink location problems in large-scale wireless sensor networks. We propose a mathematical formulation for calculating the energy dissipation throughout the network, and state different problems depending on the design criteria. Finally, we consider locating sink nodes in the sensor environment under a time constraint that specifies the minimum required operation time. We test our solution using simulations.

ÖZET

TELSİZ DUYARGA AĞLARINDA BİRDEN FAZLA MERKEZ YERLEŞTİRME PROBLEMİ VE ENERJİ VERİMLİLİĞİ

Energy is the most important resource for the lifetime of wireless sensor devices. Its use must therefore be managed as well as possible to maximize the lifetime of the network. To save energy, multi-hop communication links should be used in addition to transmitter circuits with adjustable output power. Furthermore, in large-scale sensor networks consisting of a very large number of sensors, more than one sink should be deployed to collect data. This both makes the network easier to manage and reduces the energy expenditure of each sensor.

In this thesis, we introduce the problems related to placing multiple sink nodes in the sensor network area. We build a framework containing new formulations and definitions for multiple sink sensor networks. We then examine the effect of multi-hop communication links on energy expenditure and compare the gains on alternative routes using analytical techniques. In doing so, we show that using multi-hop links does not always yield an energy gain. We also show that ignoring the overhead energy spent at each hop, and overestimating the importance of transmitter circuits with adjustable output power, leads to large energy losses. We compare the analytical results with simulations run on different scenarios.

We then examine the multiple sink location problems in large-scale sensor networks. We propose a mathematical formulation to calculate the energy expenditure across the network, and then list different problems that vary with the design criteria. Finally, we examine the problem of placing the minimum number of sinks that satisfies a given minimum operation time constraint for the sensor network. The proposed solution is tested using simulations.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
ÖZET
LIST OF FIGURES
LIST OF TABLES
LIST OF SYMBOLS/ABBREVIATIONS
1. INTRODUCTION
  1.1. Contribution of the Thesis
  1.2. Structure of the Thesis
2. WIRELESS SENSOR NETWORKS
  2.1. Current Sensor Motes
  2.2. Sample Scenarios
  2.3. Location Awareness
  2.4. MAC Layer Interface
  2.5. Routing Technique
  2.6. Packet Structure
  2.7. Energy Model
    2.7.1. Transmitter Power Model
    2.7.2. Energy Consumption
3. MULTIPLE SINK SENSOR NETWORK DEFINITIONS AND FORMULATIONS
  3.1. Motivation
  3.2. Formulation of the Multiple Sink Network Design Problem
    3.2.1. Preliminaries
    3.2.2. Routing
    3.2.3. Path Length
    3.2.4. Branch Nodes
    3.2.5. Energy Dissipation
    3.2.6. Counting the Packets
    3.2.7. Node Lifetime
    3.2.8. Investment Cost
  3.3. Summary
4. QUANTIFYING SAVED ENERGY BY MULTI-HOPPING
  4.1. Network Model
    4.1.1. Assumptions
    4.1.2. Multi-Hop Links
  4.2. Energy Saving
    4.2.1. 1-D Communication Links
    4.2.2. Isosceles Triangular Communication Links
    4.2.3. Arbitrary Triangular Communication Links
    4.2.4. Generalization
  4.3. Simulations on the Energy Savings by Multi-hopping
    4.3.1. Simulation Setup
    4.3.2. Results
  4.4. Conclusions on the Energy Savings by Multi-hopping
5. THE EFFECT OF OVERHEAD ENERGY TO THE NETWORK LIFETIME
  5.1. Motivation for Overhead Energy Considerations
  5.2. Simulations on the Effect of Overhead Energy
    5.2.1. Simulation Setup
    5.2.2. Results
  5.3. Conclusions on the Effect of Overhead Energy
6. MULTIPLE SINK SENSOR NETWORK DESIGN PROBLEM
  6.1. Design Criteria
    6.1.1. Number of Sinks
    6.1.2. Network Lifetime
    6.1.3. Routing
    6.1.4. Cluster Members
    6.1.5. Location of Sinks
    6.1.6. Data Generation Rate
    6.1.7. Energy Model
  6.2. Routing Decisions
    6.2.1. Minimum Energy Tree
    6.2.2. Minimize the Maximum Energy Dissipation at Sensor Nodes
    6.2.3. Minimize the Maximum Energy Path
    6.2.4. Maximum Residual Energy Path
  6.3. Redeployment Scenarios
    6.3.1. Random Redeployment
    6.3.2. Neighborhood Redeployment
    6.3.3. Replacement
    6.3.4. Redundant Deployment
  6.4. Sink Location Problems
    6.4.1. Find the Best Sink Locations (BSL)
    6.4.2. Minimize the Number of Sinks for a Predefined Minimum Operation Period (MSPOP)
    6.4.3. Find the Minimum Number of Sinks while Maximizing the Network Life (MSMNL)
  6.5. Differences with Concentrator Location Problem
  6.6. A Solution Technique for the MSPOP Problem
    6.6.1. Deployment of the Sensor Nodes
    6.6.2. Finding Location Information
    6.6.3. Collecting the Location Information from the Field
    6.6.4. Finding the Best Location for K Sink Nodes
    6.6.5. Estimating the Network Lifetime
  6.7. Computational Experiments on Multiple Sink Sensor Network Problems
    6.7.1. Simulation Setup
    6.7.2. Demonstrative Example for the BSL Problem
    6.7.3. Application of the Solution Technique to the MSPOP Problem
    6.7.4. Conclusion for the Computational Experiments
7. CONCLUSION AND FUTURE WORK
  7.1. Conclusion
  7.2. Future Work
APPENDIX A: OPNET IMPLEMENTATION DETAILS
  A.1. Wireless Sensor Network
  A.2. Node Model
  A.3. Network Layer Process Model
  A.4. Data Link Layer Process Model
  A.5. Packet Structure
REFERENCES

LIST OF FIGURES

Figure 2.1. Berkeley/Crossbow Mica motes compared with a US quarter (25 mm)
Figure 2.2. Smart Dust Motes (5 mm)
Figure 2.3. General architecture of a sensor node
Figure 2.4. Data delivery from source to the sink using intermediate nodes
Figure 2.5. Basic link layer packet structure
Figure 3.1. A large-scale sensor network with three clusters
Figure 3.2. A path from the sensor i to the sink s through intermediate nodes j and k
Figure 3.3. (a) A sensor network graph, (b) Corresponding minimum energy tree
Figure 3.4. The set of relay nodes of the path $P_{i \to s}$
Figure 3.5. The set of branch nodes of the relay node j
Figure 3.6. The packet generation interarrival times $Z_i^{(n)}$ for the initiator node i
Figure 4.1. Radio transmission with different power levels results in different transmission ranges
Figure 4.2. Using multi-hop links in routing decisions
Figure 4.3. Routing decision alternatives, (a) direct communication, (b) and (c) using an intermediate node
Figure 4.4. Energy saving in 1-D communication scenario
Figure 4.5. Effect of α on energy saving in 1-D communication scenario
Figure 4.6. Energy saving in isosceles triangular communication scenario
Figure 4.7. Effect of α on energy saving in isosceles triangular communication scenario
Figure 4.8. Arbitrary triangular communication scenario
Figure 4.9. Energy saving in arbitrary triangular communication scenario
Figure 4.10. Generalization into a multi-hop path
Figure 4.11. Average hop count versus overhead energy τ (A = 200 m x 200 m, α = 3)
Figure 4.12. Average node energy versus overhead energy τ (A = 200 m x 200 m, α = 3)
Figure 4.13. Average node energy versus average hop count (A = 200 m x 200 m, α = 3, only $P_{cont}$ nodes are used)
Figure 4.14. Average node energy versus overhead energy τ (A = 400 m x 400 m, α = 3)
Figure 4.15. Average hop count versus overhead energy τ (A = 200 m x 200 m, only $P_{cont}$ nodes are used)
Figure 4.16. Average node energy versus overhead energy τ (A = 200 m x 200 m, only $P_{cont}$ nodes are used)
Figure 4.17. Average hop count versus path loss exponent α (A = 200 m x 200 m, τ = 20 mJ, only $P_{cont}$ nodes are used)
Figure 5.1. A sample network representing different topology alternatives for different path loss exponent α and overhead energy τ values
Figure 5.2. Average packet delivery energy versus overhead energy
Figure 5.3. Average node energy versus overhead energy
Figure 5.4. Average hop count versus overhead energy
Figure 5.5. Network lifetime versus overhead energy
Figure 6.1. System design algorithm
Figure 6.2. Sample sensor network with 200 sensors and three sinks
Figure 6.3. Energy and disconnected region maps, until the 60th day
Figure 6.4. Exhausted nodes versus time
Figure 6.5. Unreachable nodes versus time
Figure 6.6. Unreachable nodes versus time using rerouting
Figure 6.7. Exhausted nodes versus time using rerouting
Figure 6.8. Percentage of exhausted nodes versus time, with different number of sinks
Figure 6.9. Percentage of unreachable nodes versus time, with different number of sinks
Figure 6.10. Comparison of random placement with k-means algorithm, with three sinks
Figure 6.11. Change in the number of sinks for different network lifetime requirements
Figure A.1. Sample wireless sensor network scenario
Figure A.2. Sensor node model
Figure A.3. Process diagram for the network layer
Figure A.4. Process diagram for the data link layer

LIST OF TABLES

Table 2.1. Optimal packet size in link layer
Table 2.2. Length of binary BCH codes with different t
Table 2.3. Path loss exponents for different environments
Table 4.1. Simulation parameters
Table 5.1. Average energy dissipation at sensor nodes
Table 6.1. Simulation parameters
Table 6.2. Expected network lifetime, with ρ = 0.25

LIST OF SYMBOLS/ABBREVIATIONS

$a_{jk}^{\min}$    Adjacency matrix of the minimum energy tree $T^{\min}$
$A$    Set of arcs in the sensor network
$b_j$    Branch size of the relay node j
$B_s$    Set of branch nodes of the sink node s
$B_j^s$    Set of branch nodes of the relay node j
$c$    Speed of light
$c_r^D$    Cost of deployment action of the rth redeployment
$c_r^N$    Cost of a sensor node at the rth redeployment
$c_s^P$    Cost of placement of the sink node s
$c_s^S$    Cost of the sink node s
$C$    Total investment cost
$C^D$    Total cost of deployment action
$C^N$    Total cost of sensor nodes
$C_r^N$    Total cost of sensor nodes at the rth redeployment
$C^P$    Total cost of sink node placements
$C^S$    Total cost of sink nodes
$d$    Euclidean distance
$d_{ij}$    Euclidean distance between two nodes having indexes i and j
$D^S$    Budget dedicated for the total sink investment
$e_{ij}$    Energy cost of the arc (i, j)
$e_{i \to s}$    Total energy dissipation for a data packet on the path $P_{i \to s}$
$e_j$    Relay energy load of the node j
$e_j(t)$    Total energy dissipation of the node j during the time interval (0, t]
$e_x$    Energy required for task x
$E_j(t)$    Residual energy of the node j at a given time t
$f$    Frequency
$G$    Directed graph representing the sensor network
$G_x$    Antenna gain
$K_s$    Service capacity of the sink s
$l$    Length of a packet in bits
$l_x$    Length of field x of a packet in bits
$l_{i \to s}$    Path length of the path $P_{i \to s}$
$n_i^G(t)$    Number of packets generated by the initiator node i
$n_j^R(t)$    Number of packets going through the relay node j
$n_r$    Number of sensor nodes in the rth redeployment
$N$    Set of sensor nodes
$N_r$    Set of sensor nodes in the rth redeployment
$p$    Bit error rate of the radio channel
$p_{jk}^{is}$    Path matrix of the tree T
$P_{i \to s}$    Path from a sensor node i to a sink node s
$P_{i \to s}^{\min}$    Minimum energy path from the sensor node i to the sink node s
$P_x$    Power required for task x
$r$    Number of redeployments in a sensor network
$r_j^{is}$    Relay matrix of the tree T
$R_{i \to s}$    Set of relay nodes of the path $P_{i \to s}$
$S$    Set of sink nodes
$t$    Error correcting capabilities in binary BCH codes
$t$    Time
$T^{\min}$    Minimum energy tree
$V$    All possible nodes in the network
$V_{i \to s}$    Vertex set of the path $P_{i \to s}$
$V_{i \to s}^{\min}$    Vertex set of the minimum energy path $P_{i \to s}^{\min}$
$X_i(t)$    Number of packets generated during the time interval (0, t]
$Z_i^{(n)}$    nth interarrival time of packets
$\alpha$    Path loss exponent
$\delta E$    Energy saving
$\eta$    Energy efficiency
$\lambda$    Wavelength of the signal
$\mu_i$    Expected value of the interarrival time of packets
$\rho(t)$    Sensor measurement reliability function
$\rho_{Threshold}$    Predefined threshold for sensor measurement reliability
$\tau$    Overhead energy

ADC    Analog to Digital Converter
BCH    Bose-Chaudhuri-Hocquenghem codes
BSL    Best Sink Locations
CA    “Consider” Algorithm
CLP    Concentrator Location Problem
EAR    Eavesdrop-And-Register
FDMA    Frequency Division Multiple Access
FEC    Forward Error Correction
FSM    Finite State Machine
GPS    Global Positioning System
IA    “Ignore” Algorithm
IEEE    Institute of Electrical and Electronics Engineers
ISM    Industrial, Scientific and Medical
ISO    International Organization for Standardization
MAC    Medium Access Control sublayer
MEMS    Micro-Electro-Mechanical Systems
MSMNL    Minimization of the number of Sink nodes while Maximizing the Network Lifetime
MSPOP    Minimization of the number of Sink nodes for a Predefined minimum Operation Period
OSI    Open Systems Interconnection
SMACS    Self-organizing Medium Access Control for Sensor networks
TDMA    Time Division Multiple Access

1. INTRODUCTION

Wireless sensor nodes combine wireless communication infrastructure with sensing technology. Instead of transmitting the perceived data to the control center through wired links, ad hoc communication methods are utilized, and the data packets are transmitted using multi-hop connections [1, 2]. Advances in Micro-Electro-Mechanical Systems (MEMS) technology have made it possible to construct small, low-cost, low-power electronic devices with sensing and wireless communication capabilities. These devices form a self-organizing ad hoc network to forward data packets towards sink nodes. Several survey papers provide in-depth background on sensor networks [3-6].

The self-organization feature of sensors makes it feasible to deploy them randomly over the region being observed. Without any prior exploration, sensors might be installed in the environment in a random way, for example by dropping them from an aircraft. In this manner, a large number of sensor nodes are spread over the environment without prior knowledge of where each individual sensor is placed.

The most critical resource in the sensor network is the available energy of the sensor nodes. Unless the sensor nodes are equipped with energy-scavenging tools, their only energy resource is their installed battery, and sensors with exhausted batteries can no longer operate. Moreover, since sensor nodes act as relay nodes that propagate the data of other sensors to the sink nodes, network connectivity decreases gradually as nodes fail [7].
This may result in disconnected subnetworks of sensors, i.e., some portions of the network may become completely unreachable. Therefore, the level of power consumption must be considered at each stage of wireless sensor network design.

Sensor nodes have a short transmission range due to their limited radio capabilities. Therefore, the data must be relayed towards the sink using intermediate nodes. In addition, it may be more advantageous to use a multi-hop path to the sink node consisting of shorter links rather than a single long connection.

In some applications, several thousand sensor nodes might be deployed over the monitored region. For example, agricultural and environmental monitoring applications would require such large-scale sensor networks. Moreover, the diameter of the region might easily be several kilometers. In this case, scalability of the network becomes a very important design issue. In order to obtain a scalable network, the sensor nodes should be divided into clusters. The nodes within a cluster will then be connected to the sink nodes dedicated to that cluster. Besides finding the best number of sink nodes, their optimum placement within the field is also important.

1.1. Contribution of the Thesis

In this thesis, we have introduced the “multiple sink sensor network design problem.” We have given a framework consisting of new formulations for multiple sink sensor networks. Starting with the definitions of the sensor network, we have provided an infrastructure that is independent of the routing algorithm and that is used in the derivations of the problem.

Then, we have investigated the use of an intermediate node forming multi-hop links, and its effect on energy gain. We have focused on uniformly deployed sensor nodes, each having identical communication capabilities. The sensor nodes are assumed to be able to adjust their transmission power.
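The trade-off between a direct link and a relayed link can be illustrated with a small sketch. It assumes a simplified per-hop cost of τ + k·d^α, where d is the hop distance, α the path loss exponent and τ a fixed overhead energy per hop. This cost form, the constant k, and the function names `link_energy` and `relay_saving` are illustrative assumptions, not necessarily the energy model developed later in the thesis.

```python
def link_energy(distance, alpha, tau, k=1.0):
    """Assumed per-packet cost of one hop: fixed overhead plus a path loss term."""
    return tau + k * distance ** alpha


def relay_saving(d_direct, d_first, d_second, alpha, tau, k=1.0):
    """Energy saved by relaying through one intermediate node.

    A positive value means the two-hop route is cheaper than the direct
    link; a negative value means relaying costs more energy.
    """
    direct = link_energy(d_direct, alpha, tau, k)
    relayed = (link_energy(d_first, alpha, tau, k)
               + link_energy(d_second, alpha, tau, k))
    return direct - relayed


if __name__ == "__main__":
    # Relay placed at the midpoint of a 100-unit link, alpha = 3 (arbitrary units).
    for tau in (0.0, 1e5, 1e6):
        print(tau, relay_saving(100.0, 50.0, 50.0, alpha=3, tau=tau))
```

Under these assumptions, relaying saves energy while the overhead τ is small compared with the path loss term, and becomes a net loss once the extra overhead of the second hop exceeds the path loss saving.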
Each sensor therefore consumes only the energy needed for its transmission to reach the intended receiver. We have studied different multi-hop communication scenarios and calculated the energy saving in each scenario. We have also extended these scenarios to general cases. The generalization can be applied to any arbitrary triangle and can be used in energy-optimized route calculations. We have also quantified the effect of the path loss exponent α and the overhead energy τ on energy saving. It is shown that the sensor lifetime can easily be doubled using power adjustable transmitter circuitry.

Thereafter, we have shown that neglecting the overhead energy during routing decisions could result in suboptimal energy usage. The effect of overhead energy is usually ignored in traditional ad hoc networks, where the transmission energy is much higher than the overhead energy. However, in sensor networks, due to the short communication ranges, the overhead energy has to be included in the overall energy cost in the routing calculations. We have investigated the use of multi-hop communication links in routing and compared the energy gain obtained with correct energy calculations. We show that neglecting the overhead energy and overemphasizing the importance of power adjustable transmitter circuitry could result in considerable energy loss.

Finally, we have stated the characteristic features of the multiple sink location problems in large-scale wireless sensor networks. Several design issues, including different design criteria, routing alternatives and redeployment scenarios, are presented. The effect of sink node locations on the total network lifetime is analyzed. Predefined constraints stating the minimum required operational time for the sensor network are incorporated into the design problems.
Solution techniques that find the best sink locations and the number of sink nodes are presented. These solution techniques are evaluated using demonstrative examples and simulations.

1.2. Structure of the Thesis

In the next chapter, we give a brief introduction to wireless sensor networks. We first introduce the sensor devices and how they are physically constructed. After that, we present the underlying network architecture and the energy model.

In Chapter 3, the multiple sink sensor network framework is introduced. A mathematical formulation of sensor networks is given, which is later used to represent the routing tree, to define the communication paths and relay sets, and to calculate the overall energy dissipation in the network.

In Chapter 4, we provide a formulation to quantify energy saving using multi-hop communication links. The results are compared with simulations. We show that multi-hopping is not always advantageous, and formulate when to use multi-hopping.

In Chapter 5, we analyze the effect of overhead energy on the network lifetime. We show that neglecting the overhead energy during routing calculations could result in suboptimal routing trees, which cause higher energy dissipation at sensor nodes.

In Chapter 6, we introduce the multiple sink sensor network design problem. Several design criteria and objectives are presented. The effects of routing decisions and redeployment scenarios of sensor nodes are stated. Together with the definitions of sink location problems, a solution technique is also given. After that, computational experiments are presented. The energy map of the network and the map of the unreachable region through its lifetime are presented within the simulation results.

Finally, we conclude the thesis and provide some future research directions.

2. WIRELESS SENSOR NETWORKS

Industrial sensors are responsible for perceiving a physical phenomenon in the environment.
Thereafter, the data gathered through the sensors has to be forwarded to a control center for further processing. Instead of transmitting this data through wired links, wireless sensor nodes employ wireless communication technologies for data propagation.

Advances in technology have enabled the construction of small, low-cost, low-power electronic devices coupled with sensing and wireless communication capabilities. These sensor elements can easily build a self-organizing network for information propagation [1, 2]. Several survey papers provide in-depth background on sensor networks [3-6].

In this chapter, application specific design issues are discussed. In addition to sample scenarios to which this work could be applied, network specific technical details are also mentioned.

2.1. Current Sensor Motes

Recent advances in MEMS technology have enabled small electronic devices coupled with sensor and communication equipment. The main focus of this production cycle is to achieve very low-cost devices.

Figure 2.1. Berkeley/Crossbow Mica motes compared with a US quarter (25 mm) [8]

The Berkeley/Crossbow Mica mote (see Figure 2.1) is about the size of a US quarter (25 mm) and combines a multi-channel transceiver, an on-board temperature sensor, and a processing unit [8]. The transceiver is capable of working on the 898/916 MHz or 433 MHz Industrial, Scientific and Medical (ISM) bands, where the radio power is programmable.

Figure 2.2. Smart Dust Motes (5 mm) [9]

Another successful implementation is the Smart Dust Motes (see Figure 2.2). These devices communicate using laser beams and are envisioned to shrink to one cubic millimeter in size [9].

Figure 2.3. General architecture of a sensor node, redrawn from [6]

The general architecture of a sensor node is shown in Figure 2.3 (redrawn from [6]).
The major components are the sensing unit, the processing unit, the transceiver, and the power unit. The environmental information is retrieved using the sensor and converted to digital data with an analog to digital converter (ADC). This data is forwarded to the processing unit to become a data packet that is to be sent to the sink node for further examination. The communication between the sensor nodes is carried out with the transceiver. The power unit feeds all these components with the necessary operational power.

The optional units, such as the location finding system, mobilizer and power generator, may be embedded in the node depending on the application. Most applications require some location information for the sensed data when it reaches the sink node. Mobility might also be an application-specific requirement. Although most monitoring applications utilize only static sensor nodes, for some tracking scenarios mobility might be a major design criterion. Finally, in order to prolong the lifetime of a sensor node, a power scavenging tool such as a solar cell can be attached to the node.

2.2. Sample Scenarios

Wireless sensor networks have many application areas mentioned in the literature. A detailed list can be found in [6]. Moreover, some applications require a more detailed analysis, since there might be some application specific constraints to be considered.

The self-organization feature of sensors makes it feasible to deploy them randomly over the region being observed. Without needing a previous exploration, sensors might be installed in the environment in a random way, for example by dropping them from an aircraft. In this manner, a large number of sensor nodes are spread over the environment without prior knowledge of where each individual sensor is placed. These sensors are assumed to be distributed uniformly in the environment. Two other deployment strategies are mentioned in [10].
The sensors may be regularly placed in some geometric topology depending on the application, e.g., a grid. They can also be placed with prior knowledge of the phenomenon to be observed, resulting in a biased installation. In places where the phenomenon is more likely to occur or appears more densely, a larger number of sensors might be necessary for a more precise investigation. In order to reduce the cost of deployment, a path exposure method is proposed in [11].

Having been deployed in the environment, the sensors start to observe the phenomenon. Data from the sensors might be gathered in different ways. First, the sensors might continuously send reports to the sinks at an application-dependent predefined interval. Second, they might be polled by the control unit. In this case, either all the sensors or only a small portion lying in the suspected region might be queried. When all sensors are queried, the query is spread to the network using broadcast methods, whereas when only a portion is queried, multicast communication techniques must be employed to save resources. Third, sensors can decide to send data when they observe a specific event [12, 13].

In many environmental applications like forest fire detection, soil erosion monitoring, air pollution measurements, or monitoring the salinity level of the field, sensors are distributed randomly in the considered environment. Due to the extreme size of the area and the application’s need for complete coverage, a very large number of sensors must be deployed. In this case, scalability becomes a crucial issue. Therefore, the complete sensor network should be divided into clusters to achieve a more stable system [14-16]. In this manner, not only will the system be easier to manage, but the total network lifetime will also increase, resulting in a more economical investment.
Several biomedical applications can also make use of wireless sensor nodes through incorporation of sensing materials with wireless communication circuitry, such as a glucose level monitor or retina prosthesis [17]. When we consider wireless networking of human-embedded smart sensor arrays, the design constraints are very different. The solutions should be ultra-safe and reliable, work trouble-free in different geographical locations, and require minimal maintenance.

Another interesting application is habitat monitoring. In [18], a seabird nesting environment is monitored. This experiment is accomplished using 32 sensor nodes on a small island, streaming live data onto the web.

In [19], the concept of the “smart kindergarten” is introduced, where developmental problem-solving environments for early childhood education are incorporated with wireless sensor networks. Here, sensor-enhanced toys and classroom objects are connected with back-end middleware services and database techniques.

2.3. Location Awareness

Since sensor nodes are spread randomly over the field, they initially do not know their exact locations. Many applications, however, require location information to achieve the desired functionality. Extracting location information from a Global Positioning System (GPS) module attached to the sensor is not a feasible solution [20, 21]. First, these devices are physically large and energy-hungry. Second, in many applications, sensor antennae cannot be in line-of-sight of the satellites. In addition, they are still very expensive devices, producing a costly solution for location estimation.

Although GPS cannot be a solution for the location estimation problem, current research on this topic has provided good alternatives. In [22], a centralized method is proposed. Using convex position constraints derived from the connectivity information, the position estimation is performed relative to nodes whose location information is known a priori.
In [21], a radio frequency technique is used to estimate location. Each beacon periodically signals overlapping location information to the network. Depending on the connectivity metric, nodes localize themselves to the centroid of their nearby beacons. A similar approach is given in [23], where a collaborative multi-lateration technique is presented. Using this method, ad-hoc deployed sensor nodes can estimate their locations by using beacon locations that are several hops away and distance measurements to neighboring nodes. A different approach in [24] is based on an angle of arrival estimation technique. In this work, beacon nodes are equipped with a directional antenna, with which they can send directional beacon signals that are powerful enough to be heard by all sensor nodes.

2.4. MAC Layer Interface

For a careful design of wireless sensor networks, one should consider an appropriate MAC layer optimized for sensor communication. One should always keep in mind that sensor nodes are low-power devices and do not contain a strong computational unit. Therefore, MAC layers designed for traditional ad hoc networks cannot be applied to wireless sensor networks. Several MAC layer alternatives are proposed in the literature. For a detailed list, the reader may refer to [6].

Self-organizing Medium Access Control for Sensor networks (SMACS) is an infrastructure building protocol that forms a flat topology for sensor networks [25]. This is a distributed protocol that enables a collection of nodes to discover their neighbors and establish transmission/reception schedules for communicating with them without the need for any local or global master nodes. To reduce the likelihood of collisions, it requires each link to operate on a different frequency. This frequency band is chosen at random from a large pool of possible choices when the links are formed.
In order to provide continuous service to mobile sensor nodes, the Eavesdrop-And-Register (EAR) algorithm is proposed [25]. This algorithm enables seamless interconnection of mobile nodes in the field of stationary wireless nodes, and represents the mobility management aspect of the SMACS protocol.

In [26], time division multiple access (TDMA) and frequency division multiple access (FDMA) schemes are discussed. In TDMA, the transmission time is minimized, as the full bandwidth of the channel is allocated to a single sensor node. However, in this case, only one sensor can be actively transmitting. In order to enable simultaneous transmissions, an FDMA scheme can be used, where the bandwidth is divided into frequencies that are assigned to different sensors. In this case, the transmission time is maximized. A hybrid scheme involving both TDMA and FDMA is also introduced.

This thesis is independent of the MAC layer. The sink location and the related clustering mechanism can be applied on top of any MAC layer that the sensor hardware employs. Therefore, the MAC layer is not considered a fundamental part of the solution. On the contrary, the solution can be used with any MAC layer available on the market.

2.5. Routing Technique

In order to utilize the sensor’s energy in the most beneficial manner, power-aware routing methods must be used. Since these devices have limited battery resources, the underlying routing protocol should pay attention to the power level of each sensor in the network.

Data that is extracted from the environment should be forwarded to the sink nodes for further processing. During this phase, sensors constitute an ad hoc network infrastructure and data packets are routed to the sink node through intermediate nodes. Each node generates a small data packet containing the knowledge gathered from the environment. This data packet is sent to the destination using the underlying routing method with the help of intermediate sensor nodes.
Intermediate nodes have several alternatives. These alternatives are application dependent and may be chosen according to user needs.

(i) They can directly forward the packet to the next relay node, or to the destination if this is the last hop on the way.
(ii) They can delay the forwarding for a moment, waiting for other sensors that might also be generating packets for the same destination, so that all these packets can be merged into one larger packet.
(iii) Similar to (ii), but this time the data in each packet might be extracted and aggregated into a new result, and this result is forwarded to the destination.
(iv) An intermediate node can also add its own measurements to the packets, using the methods described in either (ii) or (iii).

In Figure 2.4, sensor nodes i1 and i2 transmit data packets simultaneously. Their packets are routed to the sink node s through intermediate nodes. The underlying routing method may choose to merge the data packets into one packet at the intermediate nodes on the way to the destination. All the other nodes in the environment may stay idle during this communication.

Figure 2.4. Data delivery from source to the sink using intermediate nodes

In this routing mechanism, intermediate nodes that have enough residual power should be used as relays. The choice of intermediate nodes can be performed in a distributed manner at the node level, or centrally at the destination. In the centralized case, global knowledge of node status information is assumed. Capturing this data is not unrealistic. Sensor nodes are sending their measurements to the destinations. Supplementary information like their geographic location and battery level may be piggybacked on their data packets. As a result, the destination nodes may retrieve all the necessary information about the current network infrastructure and remaining resources from the field.
Furthermore, since these nodes are more powerful in terms of computational power and battery resources, they can easily perform extensive computations such as centralized routing decisions.

2.6. Packet Structure

Data packets need to be carefully designed to carry the information gathered from the environment. Packets originate from source sensors and are sent to intermediate nodes in order to be forwarded to the destination. In the previous section, alternative routing, merging and aggregating mechanisms are stated. Besides their effect on routing, these requirements also affect the underlying packet structure.

Figure 2.5. Basic link layer packet structure, redrawn from [27]

The basic link layer packet structure presented in [27] is given in Figure 2.5. Here, the packet is composed of header, payload and trailer parts, which are assumed to be $l_h$, $l_p$ and $l_t$ bits long, respectively. The header field contains segment information corresponding to higher layer packets, and source and destination identifiers. Whenever the application does not require the exact node identifier, a collection of event, location, and attribute identifiers could easily replace the header information, resulting in a much shorter field of a few bytes. The payload contains information bits and the trailer part contains error control bits.

The size of the payload depends on the information that the packet contains. The data gathered from the phenomenon should be sent to the destination. For temperature, humidity or attribute sensors, only one or two bytes will be sufficient to code the information. Depending on the alternative routing and aggregating mechanism, this data will be replicated for each intermediate sensor in the routing tree. For centralized power-aware routing methods, the current battery level of the sensor should be sent to the destination nodes.
Furthermore, again depending on the alternative routing and aggregating mechanism, this battery information contains data for each sensor in the routing tree. This information is extracted at the destination and used in route calculations.

Table 2.1. Optimal packet size in link layer [27]

| FEC Method | η | Min | Max |
|---|---|---|---|
| Without FEC | 0.70 | 100 | 500 |
| BCH, t = 2 | 0.88 | 400 | 800 |
| BCH, t = 4 | 0.93 | 1000 | 1500 |
| BCH, t = 6 | 0.95 | 1500 | 3000 |

In [27], a detailed analysis is presented to estimate the optimum payload size considering energy efficiency (η). The payload size is found to lie between 50 and 500 bytes depending on the bit error rate of the channel when no error control mechanism is used. This size increases up to a minimum of 500 and maximum of 3000 bytes according to the error correcting capability that is employed. Here, binary BCH codes are used with different error correcting capabilities (t), i.e., the maximum number of bit errors that can be corrected seamlessly. Approximate results for raw bit error rate $p = 10^{-3}$ are summarized in Table 2.1.

Table 2.2. Length of binary BCH codes with different t [28]

| t | Total packet size | BCH code length | Data size |
|---|---|---|---|
| 2 | 63 | 12 | 51 |
| 2 | 255 | 16 | 239 |
| 2 | 511 | 18 | 493 |
| 2 | 1023 | 20 | 1003 |
| 4 | 63 | 24 | 39 |
| 4 | 255 | 32 | 223 |
| 4 | 511 | 36 | 475 |
| 4 | 1023 | 40 | 983 |
| 6 | 63 | 33 | 30 |
| 6 | 255 | 48 | 207 |
| 6 | 511 | 54 | 457 |
| 6 | 1023 | 60 | 963 |

Using BCH codes, however, adds extra error correcting bits to the data packets. A designer should consider this overhead when estimating the necessary packet size. Examples of these overhead bits are given in Table 2.2. For a detailed description of binary BCH codes, the reader may refer to [28].

2.7. Energy Model

Efficient energy consumption is one of the most important design constraints in wireless sensor network architecture [29]. The life of each sensor node depends on its power dissipation. In applications where the sensors are not equipped with energy scavenging tools like solar cells, sensors with exhausted batteries cannot operate anymore.
Moreover, since sensor nodes behave as relay nodes for data propagation of other sensors to sink nodes, network connectivity decreases gradually [7]. This may result in disconnected subnetworks of sensors, i.e., some portions of the network may become unreachable. Therefore, the level of power consumption must be considered at each stage of wireless sensor network design.

2.7.1. Transmitter Power Model

As mentioned before, the main concern in wireless sensor network design is power. The underlying architecture must consider power efficiency as a major constraint. A good evaluation of the available techniques can be found in [30].

To start, consider the radio propagation model in a single-path free-space channel. The relationship between transmitted power $P_t$ and received power $P_r$ is given by

$$\frac{P_r}{P_t} = G_t G_r \left( \frac{\lambda}{4 \pi d} \right)^2 \tag{2.1}$$

where $G_t$ and $G_r$ are the transmitter and receiver antenna gains respectively, $d$ is the distance between the transmitter and receiver, $\lambda = c / f$ is the wavelength of the transmitted signal, $f$ is its frequency, and $c$ is the velocity of radio wave propagation in free space, which is equal to the speed of light. Using Equation 2.1, we derive

$$P_t = \omega d^2 \tag{2.2}$$

where $\omega = \dfrac{P_r}{G_t G_r} \left( \dfrac{4 \pi}{\lambda} \right)^2$. Equation 2.2 can be further generalized as

$$P_t = \omega d^{\alpha} \tag{2.3}$$

where $\alpha > 1$ is known as the path loss exponent. For the free-space channel, we have seen in Equation 2.2 that $\alpha = 2$. Table 2.3 gives a list of typical path loss exponent values obtained in various radio environments [31]. In many sensor applications, it is assumed that $\alpha$ ranges between 2 and 4, since the sensors have short antennae, which are very close to the ground.

Table 2.3. Path loss exponents for different environments [31]

| Environment | α |
|---|---|
| Free space | 2 |
| Urban area cellular radio | 2.7 to 3.5 |
| Shadowed urban cellular radio | 3 to 5 |
| In building line-of-sight | 1.6 to 1.8 |
| Obstructed in building | 4 to 6 |
| Obstructed in factories | 2 to 3 |

Power is defined by the rate of change in the energy with time [32].
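The transmitter power model of Equations 2.1 to 2.3 can be illustrated with a minimal Python sketch. The function names and all numeric values (antenna gains, carrier frequency, required received power) are assumptions chosen only for illustration; they are not taken from the thesis. Note also that reusing the free-space coefficient $\omega$ for $\alpha \neq 2$ is a simplification, since in practice the coefficient would be fitted to the environment.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def omega(p_r, g_t, g_r, freq_hz):
    """Coefficient of Equation 2.2: (P_r / (G_t * G_r)) * (4*pi / lambda)**2."""
    wavelength = C / freq_hz
    return (p_r / (g_t * g_r)) * (4.0 * math.pi / wavelength) ** 2


def transmit_power(d, w, alpha=2.0):
    """Equation 2.3: P_t = omega * d**alpha."""
    return w * d ** alpha


# Hypothetical link: unity-gain antennas, 916 MHz carrier,
# and 1 nW required at the receiver input.
w = omega(p_r=1e-9, g_t=1.0, g_r=1.0, freq_hz=916e6)

# Doubling the distance multiplies the required power by 2**alpha.
for alpha in (2.0, 3.0, 4.0):
    ratio = transmit_power(20.0, w, alpha) / transmit_power(10.0, w, alpha)
    print(f"alpha = {alpha}: power ratio for doubled distance = {ratio:.1f}")
```

The loop makes the role of the path loss exponent concrete: the larger $\alpha$ is, the more expensive a long link becomes relative to a short one.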
Consequently, the amount of energy necessary to operate for a duration $\Delta t$ while consuming power $P$ is

$$\Delta E = P \, \Delta t \tag{2.4}$$

2.7.2. Energy Consumption

Energy consumption in an arbitrary sensor node has, in general, the following components, depending on the operations performed within the node:

(i) Sensing Energy: In order to activate the sensing circuitry within the node and gather data from the environment, an amount of energy must be dissipated, which is called the sensing energy, $e_S$. The magnitude of this energy depends on the task that is assigned to the sensor. Different sensors require different levels of energy during operation.
(ii) Transmitter Energy: Afterwards, this data must be transmitted towards the destination. Therefore, the transmitter circuitry must be operated. For this operation, the transmitter energy, $e_T$, must be consumed, which depends on the transmitter power, $P_t$, the size of the data packet, and the data transfer rate.
(iii) Receiver Energy: As a relay node, a sensor node is also in charge of forwarding data packets of other sensor nodes. For this operation, sensors must be able to receive those data packets. The receiver energy, $e_R$, is consumed during this operation and is independent of the distance between nodes. During reception of the data packet at the given data transfer rate, the receiver power, $P_r$, is spent.
(iv) Computation Energy: To operate these circuitries, the sensor’s processing unit must be activated. Moreover, whenever data aggregation is performed, additional computations must be carried out. Compared to the previous items, the computation energy, $e_C$, is relatively low [33].

During the life cycle of a typical sensor node, each event or query will be followed by a sensing operation, the calculations necessary to derive a data packet, and the transmission of this packet to the destination. In addition, sensor nodes often relay data packets received from other sensors.
Thus, the total energy, $e_{Total}$, in an arbitrary active time frame can be presented as the sum of the above energy requirements.

$$e_{Total} = e_T + e_R + e_S + e_C \tag{2.5}$$

Efficient sensing circuitries and computation algorithms help to reduce $e_S$ and $e_C$. The other two components, $e_T$ and $e_R$, depend on the communication architecture and underlying techniques. Therefore, power aware methods must be employed in order to reduce the energy consumption during communication [33].

Only the transmitter energy, $e_T$, is related to the distance between the communicating sensor nodes. The other components of the total energy remain constant with varying distance between communicating pairs. Therefore, we can rewrite Equation 2.5 as a function of $d$ using Equation 2.3 and Equation 2.4 as follows.

$$e_{Total}(d) = \kappa d^{\alpha} + \tau \tag{2.6}$$

where $\kappa = \omega \Delta t$, with $\Delta t$ being the duration of the packet transmission process, and $\tau = e_R + e_S + e_C$ is the overhead energy, which is constant with varying $d$. Any other energy consuming activity in the sensor node that does not depend on the transmission distance can be added to the overhead energy component [34].

A similar energy model is proposed in [35], where the energy consumption for a message is measured as $d^{\alpha} + \tau$, with a comparable argumentation. However, the important factor $\kappa$ is missing there.

3. MULTIPLE SINK SENSOR NETWORK DEFINITIONS AND FORMULATIONS

3.1. Motivation

The efficiency of the sensor network investment is directly related to the length of the reliable monitoring duration of the field. The better the energy control mechanisms used in the sensor nodes’ firmware and in the network management techniques, the longer the network will serve its investors. Therefore, the limited battery resource of the sensors should be handled efficiently.

In some applications, several thousand sensor nodes might be deployed over the monitored region.
For example, agricultural scenarios and environmental monitoring applications would require such large-scale sensor networks. The diameter of the region might easily be as large as several kilometers. In this case, scalability of the network is a very important design issue. In order to obtain a scalable network, the sensor nodes should be divided into clusters. The nodes within a cluster will then be connected to the sink nodes dedicated to that cluster. Figure 3.1 shows such a sensor network with several nodes and three clusters with three sink nodes.

Figure 3.1. A large-scale sensor network with three clusters

During the design phase of a large-scale sensor network, the designer should decide on the number of clusters and, more importantly, the optimum locations of the sink nodes. We call this problem the “multiple sink sensor network design problem” and try to provide some solutions.

In the following sections, we give a new formulation for multiple sink sensor networks. Starting with the definitions of the sensor network, we provide an infrastructure that is independent of the routing algorithm used within the derivations of the problem. Thereafter, energy dissipation formulations are derived in order to calculate the node lifetime. Finally, the investment cost formulations are presented.

3.2. Formulation of the Multiple Sink Network Design Problem

The wireless sensor network consists of several sensor nodes and one or more sink nodes, each of which is connected through wireless links to other nodes. In this section, we derive formulations to quantify the lifetime of the sensor network, which will later be used as design objectives. In the following paragraphs, we formalize the sensor network from a graph theoretical viewpoint; the basic definitions can be found in [36, 37].

3.2.1. Preliminaries

Definition 3.1: Let $N$ = { sensor nodes } be the set of sensor nodes in the wireless sensor network, and $S$ = { sink nodes } the set of sink nodes. Then let $V = N \cup S$ denote all possible nodes in the network. Let $G = (V, A)$ be a directed graph representing the sensor network. In this graph, the vertex set $V$ stands for the nodes, and the arc set $A$ stands for valid communication links. Let $(i, j) \in A$ denote arcs, where $i, j \in V$. Let $d_{ij}$ denote the Euclidean distance between nodes $i, j \in V$.

If we assume that the radio transmitters of the nodes have unlimited transmission power, i.e., $P_t \to \infty$, then the radio signals of each node can reach every other node in the network, resulting in a fully connected graph. In the real world, however, there is a physical limit on the maximum transmission power, with $P_t \le P_{max}$. Therefore, we cannot expect $G$ to be fully connected. On the contrary, there might be some disconnected nodes, whose radio signals cannot reach any other node in the network. If we exclude these disconnected nodes from the vertex set, we obtain a new vertex set $V' \subseteq V$, where $G' = (V', A)$ forms a connected graph. Since our aim is to successfully manage the connected nodes in the network, without loss of generality, we can assume that the graph $G$ is connected.

Definition 3.2: A path from a sensor node $i_0 \in N$ to a sink node $s \in S$ is a non-empty subgraph $P_{i_0 \to s}$ of $G$, where

$$P_{i_0 \to s} = \left( V_{i_0 \to s}, A_{i_0 \to s} \right), \quad V_{i_0 \to s} = \{ i_0, i_1, \ldots, i_n, s \}, \quad i_0, i_1, \ldots, i_n \in N,$$

$$A_{i_0 \to s} = \{ (i_0, i_1), (i_1, i_2), \ldots, (i_{n-1}, i_n), (i_n, s) \} \subseteq A.$$

The node $i_0 \in N$ is called the initiator node, and the nodes $i_1, i_2, \ldots, i_n \in N$ are called intermediate nodes or relay nodes.

After the deployment phase, the sink nodes start to collect information from the sensor nodes. This data flow is performed through communication paths from sensor nodes towards the sink nodes. $P_{i \to s}$ represents these data flow paths in the network.
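Definitions 3.1 and 3.2 can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the helper names are hypothetical, node coordinates are invented, and the power limit $P_{max}$ is represented by an equivalent maximum range `max_range` (the distance at which $\omega d^{\alpha}$ reaches $P_{max}$).

```python
import math
from itertools import permutations


def build_arcs(pos, max_range):
    """Arc set A: ordered pairs (i, j) whose Euclidean distance is within max_range."""
    return {(i, j) for i, j in permutations(pos, 2)
            if math.dist(pos[i], pos[j]) <= max_range}


def path_arcs(vertices):
    """Arc set of a path given as its vertex sequence [i0, i1, ..., in, s]."""
    return set(zip(vertices, vertices[1:]))


def is_valid_path(vertices, arcs, sensors, sinks):
    """Definition 3.2: sensor nodes only, ending at a sink, every hop an arc of A."""
    *relays, last = vertices
    return (len(relays) >= 1
            and last in sinks
            and all(v in sensors for v in relays)
            and path_arcs(vertices) <= arcs)


# Hypothetical layout: three sensors and one sink on a line, 30 m apart.
pos = {"i": (0.0, 0.0), "j": (30.0, 0.0), "k": (60.0, 0.0), "s": (90.0, 0.0)}
sensors, sinks = {"i", "j", "k"}, {"s"}
arcs = build_arcs(pos, max_range=40.0)

print(is_valid_path(["i", "j", "k", "s"], arcs, sensors, sinks))
print(is_valid_path(["i", "k", "s"], arcs, sensors, sinks))
```

With a 40 m range, only neighboring nodes are linked, so the hop from i to k (60 m) is not an arc of $A$ and the second vertex sequence is not a path in the sense of Definition 3.2.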
Figure 3.2 shows such a path, where $V_{i \to s} = \{ i, \ldots, j, k, \ldots, s \}$.

Figure 3.2. A path from the sensor i to the sink s through intermediate nodes j and k

Definition 3.3: The energy cost of an arc $(i, j) \in A$, $e_{ij}$, is defined to be a real-valued function $e : A \to \mathbb{R}$. The energy cost of a path $P_{i \to s}$ from a sensor node $i \in N$ to a sink node $s \in S$ is given by

$$e(P_{i \to s}) = \sum_{(j,k) \in A_{i \to s}} e_{jk}.$$

Using the energy cost function as the metric, energy aware routing algorithms might calculate the minimum energy paths in the network in order to achieve the maximum energy saving. In other words, each sensor node delivers its data packets through a minimum energy path to a sink node. In our energy model, we use the energy cost function (see Section 2.7.2)

$$e_{ij} = \kappa d_{ij}^{\alpha} + \tau \tag{3.1}$$

where $\alpha$ is the path loss exponent, and $\kappa, \tau \in \mathbb{R}$ are real numbers. In many sensor applications, it is assumed that $\alpha$ ranges between 2 and 4, since the sensors have short antennae, which are very close to the ground.

Definition 3.4: The minimum energy path $P^{\min}_{i \to s}$ from a sensor node $i \in N$ to a sink node $s \in S$ is a path with the vertex set $V^{\min}_{i \to s}$, where $e(P_{i \to s})$ is a minimum, i.e.,

$$P^{\min}_{i \to s} = \arg\min_{P_{i \to s}} \{ e(P_{i \to s}) \}.$$

The minimum energy tree $T^{\min} = (V', A')$ is a subgraph of $G = (V, A)$, $V' \subseteq V$, $A' \subseteq A$, where

$$T^{\min} = \bigcup_{i \in N} \left\{ \min_{s \in S} P^{\min}_{i \to s} \right\}.$$

Figure 3.3. (a) A sensor network graph, (b) Corresponding minimum energy tree

The minimum energy tree is the collection of all the minimum energy paths from the sensor nodes to their corresponding sink nodes. Here, the sensor nodes are matched with the sink nodes according to the energy cost of the path connecting them. Figure 3.3 (a) shows an example sensor network as a graph, where every communication link is drawn. In Figure 3.3 (b), the corresponding minimum energy tree for each sink is shown.

Lemma 3.1: Let $T^{\min} = (V', A')$ be the minimum energy tree of $G = (V, A)$; then $N \subseteq V'$.
Proof: The lemma follows directly from the definition of the minimum energy tree $T^{\min}$. Since for all $i \in N$ there exists a sink $s \in S$ with $P^{\min}_{i \to s} \subseteq T^{\min}$, we have $i \in V'$.

This lemma is important as it states that all sensor nodes are included in the minimum energy tree. In other words, every sensor node is connected to a sink node by a minimum energy path. The statement is, however, not true for the sink nodes: not every sink node is necessarily used in the minimum energy tree. As a counterexample, assume a sink node lying at the border of the network that can be reached by a sensor node, while there exists a better alternative for this node to communicate. Then, this sensor node will not be connected to that sink node in the minimum energy tree.

Definition 3.5: Let $n = |N|$ and $v = |V|$. The adjacency matrix $M_A = (a_{jk})_{n \times v}$ of the minimum energy tree $T^{\min} = (V', A')$ is defined by

$$a_{jk} = \begin{cases} 1 & \text{if } (j, k) \in A' \\ 0 & \text{otherwise.} \end{cases}$$

Lemma 3.2: Let $P^{\min}_{i \to s}$ be the minimum energy path from a sensor node $i \in N$ to a sink node $s \in S$. Then for all $j \in V^{\min}_{i \to s}$, there exists $k \in V^{\min}_{i \to s}$ such that $a_{jk} = 1$ and $a_{jx} = 0$ for all $x \in V - \{k\}$.

Proof: Let $T^{\min} = (V', A')$, and assume $P^{\min}_{i \to s} = (V^{\min}_{i \to s}, A^{\min}_{i \to s})$, where $V^{\min}_{i \to s} = \{ i, u, \ldots, j, k, \ldots, v, s \} \subseteq V'$ and $A^{\min}_{i \to s} = \{ (i, u), \ldots, (j, k), \ldots, (v, s) \} \subseteq A'$. Then from the definition of the adjacency matrix, $a_{jk} = 1$. Since $P^{\min}_{i \to s}$ is a path from $i$ to $s$, $a_{jx} = 0$ for all $x \in V^{\min}_{i \to s} - \{k\}$. It remains to prove that $a_{jx} = 0$ for all $x \in V - V^{\min}_{i \to s} - \{k\}$. But this is clear from the definition of the minimum energy path. If there existed $x \in V - V^{\min}_{i \to s} - \{k\}$ such that $a_{jx} = 1$, then this would mean that $e_{jx} \le e_{jk}$, which is a contradiction. Hence, the lemma is true.

Theorem 3.1: For all $j \in V_{i \to s}$, where $i \in N$ and $s \in S$, we have

$$\sum_{x \in V} a_{jx} = 1.$$

Proof: For all $j \in V^{\min}_{i \to s}$, the result follows directly from the definition of the path $P^{\min}_{i \to s}$ and Lemma 3.2.
There exists $k \in V^{\min}_{i \to s}$ such that

$$\sum_{x \in V} a_{jx} = a_{jk} + \sum_{x \in V - \{k\}} a_{jx} = 1 + 0 = 1.$$

For all other paths between i and s, where $P_{i \to s} \ne P^{\min}_{i \to s}$, we know from Lemma 3.1 that there exists a sensor node $n \in N$ and a sink node $t \in S$, where j lies on the minimum energy path $P^{\min}_{n \to t}$. Thus, there exists $k \in V^{\min}_{n \to t}$ such that $a_{jk} = 1$, and the result follows.

This theorem states through the adjacency matrix that each sensor node is connected to one and only one other node towards a sink node. This conclusion is trivial from the definition of the tree, but the formula is necessary in the following formulations.

3.2.2. Routing

The energy dissipation is directly related to the underlying routing technique. Depending on the application requirements, different routing strategies might be implemented on the same network. The routes might form tree structures whose roots are the sink nodes. Alternatively, multi-path routing strategies, which generate multiple copies of the same data packet, could be employed when the communication links are less reliable. In order to handle any type of routing alternative, we represent the routing decisions using path matrices in a generalized manner, so that the results of any routing algorithm can easily be applied.

Definition 3.6: Let $n = |N|$, $m = |S|$, $v = |V| = n + m$. The path matrix $M_P = (p^{is}_{jk})_{n \times m \times n \times v}$ of the tree $T^{\min}$ is defined by

$$p^{is}_{jk} = \begin{cases} 1 & \text{if } (j, k) \in A^{\min}_{i \to s},\ i \in N,\ s \in S \\ 0 & \text{otherwise.} \end{cases}$$

The path matrix shows the results of the underlying routing algorithm. The elements of the form $p^{is}_{jk}$ should be read as follows: if we consider a path from a sensor node i to a sink node s, and if the link connecting the nodes j and k lies on that path, then the value of the element is 1, otherwise 0.
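As an illustration of Definition 3.6, the following minimal sketch fills a path matrix from the next-hop decisions of a routing tree. The node numbering (sensors 0..n-1, sinks n..n+m-1), the function name `build_path_matrix`, and the `next_hop` map are assumptions made for this example only; the thesis does not prescribe a data layout. The sketch also assumes the next-hop map is loop-free, so that every sensor eventually reaches a sink.

```python
def build_path_matrix(next_hop, n, m):
    """Return p[i][s][j][k] = 1 if arc (j, k) lies on the path from sensor i to sink s.

    Assumed layout (not from the source): sensors are numbered 0..n-1,
    sinks n..n+m-1, and next_hop[j] is the node that sensor j forwards to.
    """
    v = n + m
    # Dimensions follow Definition 3.6: n x m x n x v.
    p = [[[[0] * v for _ in range(n)] for _ in range(m)] for _ in range(n)]
    for i in range(n):
        arcs, j = [], i
        while j < n:              # keep hopping until a sink is reached
            k = next_hop[j]
            arcs.append((j, k))
            j = k
        s = j - n                 # index of the reached sink within S
        for a, b in arcs:
            p[i][s][a][b] = 1
    return p


# Hypothetical network: 4 sensors, 2 sinks (nodes 4 and 5).
next_hop = {0: 1, 1: 4, 2: 4, 3: 5}
p = build_path_matrix(next_hop, n=4, m=2)
print(p[0][0][0][1], p[0][0][1][4], p[0][1][0][1])
```

For sensor 0, whose path is 0, 1, sink 4, the entries for arcs (0, 1) and (1, 4) are set under the first sink, while every entry under the second sink stays 0.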
In multi-path routing algorithms, the binary values of the matrix showing the presence of the link could easily be extended to show the probability for a packet to choose that link, where $p^{is}_{jk} = P\{(j, k) \in A_{i \to s} \mid i \in N, s \in S\}$. In this work, we focus on tree algorithms, and therefore the path matrix represents the connections for the minimum energy tree $T^{\min}$. This matrix is later used for network-wide energy calculations.

There is a close relation between the definitions of the adjacency matrix and the path matrix. The following lemma states this relationship.

Lemma 3.3: Let $T^{\min} = (V', A')$ be the minimum energy tree in a sensor network $G = (V, A)$. For any $i \in N$ and $s \in S$, we have

$$P_{i \to s} \subseteq T^{\min} \Leftrightarrow p^{is}_{jk} = a_{jk}, \text{ for all } j, k \in V_{i \to s}.$$

Proof: The result is achieved using Lemma 3.2 and the definitions of the matrices.

(i) $P_{i \to s} \subseteq T^{\min} \Rightarrow j, k \in V_{i \to s} \Rightarrow p^{is}_{jk} = 1 \Rightarrow (j, k) \in A_{i \to s} \subseteq A' \Rightarrow (j, k) \in A' \Rightarrow a_{jk} = 1$.

(ii) $a_{jk} = 1 \Rightarrow (j, k) \in A' \Rightarrow$ there exists a path $P_{i \to s}$ such that $A_{i \to s} \subseteq A'$ and $(j, k) \in A_{i \to s} \Rightarrow p^{is}_{jk} = 1 \Rightarrow P_{i \to s} \subseteq T^{\min}$.

3.2.3. Path Length

Next, we calculate the number of sensor nodes on the path that connects a sensor node to a sink node. This value can be used for calculating the average energy dissipation for an arbitrary path in the network.

Definition 3.7: Let $i \in N$, $s \in S$. The set of relay nodes or the relay set of the path $P_{i \to s}$ is defined as

$$R_{i \to s} = \{ j \in N : j \in V_{i \to s} - \{s\} \},$$

or equivalently $R_{i \to s} = V_{i \to s} - \{s\} \subseteq N$.

In order to visualize this set, consider Figure 3.4. Here, the path from the sensor node i to a sink node s is shown, where $V_{i \to s} = \{i, j, \ldots, k, s\}$. Then we have $R_{i \to s} = \{i, j, \ldots, k\}$. Including the initiator sensor node in the set of relay nodes may be confusing, and the set could be redefined by excluding the initiator node. In our work, however, we include this node in the set.
The reason is that even this node has to "relay" its own data through the network layer to the next-hop node.

Figure 3.4. The set of relay nodes of the path $P_{i \to s}$

Definition 3.8: The path length of the path $P_{i \to s}$ is defined as $l_{i \to s} = |R_{i \to s}|$.

In order to calculate the number of sensor nodes in the path, we require one final definition, the relay matrix.

Definition 3.9: Let $n = |N|$, $m = |S|$, $v = |V| = n + m$. The relay matrix $M_R = (r^{is}_j)_{n \times m \times n}$ of the tree $T^{\min}$ is defined by

$$r^{is}_j = \begin{cases} 1 & \text{if } j \in R_{i \to s},\ i \in N,\ s \in S \\ 0 & \text{otherwise.} \end{cases}$$

An element of the relay matrix of the minimum energy tree is set to 1 if and only if that node is included in the minimum energy path from a sensor node i to a sink node s.

Lemma 3.4: Let $P^{\min}_{i \to s}$ be the minimum energy path from a sensor node $i \in N$ to a sink node $s \in S$. Then, we have

$$l_{i \to s} = \sum_{j \in N} r^{is}_j.$$

Proof: For all $j \in R_{i \to s}$, the definition of the relay matrix states $r^{is}_j = 1$. Therefore, $\sum_{j \in R_{i \to s}} r^{is}_j = |R_{i \to s}| = l_{i \to s}$. Similarly, for all $j \in N - R_{i \to s}$, we have $r^{is}_j = 0$, so that $\sum_{j \in N - R_{i \to s}} r^{is}_j = 0$. Therefore,

$$\sum_{j \in N} r^{is}_j = \sum_{j \in R_{i \to s}} r^{is}_j + \sum_{j \in N - R_{i \to s}} r^{is}_j = l_{i \to s} + 0 = l_{i \to s}.$$

Theorem 3.2: Let $j \in N$ be an arbitrary node in the sensor network, and $i \in N$, $s \in S$. For the relay matrix $M_R = (r^{is}_j)$ of the tree $T^{\min}$, we have

$$r^{is}_j = \sum_{k \in V} p^{is}_{jk}.$$

Proof: Let $R_{i \to s}$ denote the relay set of the path $P^{\min}_{i \to s} = (V^{\min}_{i \to s}, A^{\min}_{i \to s})$.

(i) If $j \in R_{i \to s}$, we know from Lemma 3.3 that $p^{is}_{jk} = a_{jk}$, for all $k \in V$. Therefore we have $\sum_{k \in V} p^{is}_{jk} = \sum_{k \in V} a_{jk}$. And from Theorem 3.1, we have $\sum_{k \in V} a_{jk} = 1$. Hence, we have $\sum_{k \in V} p^{is}_{jk} = 1 = r^{is}_j$, for all $j \in R_{i \to s}$.

(ii) If $j \in N - R_{i \to s}$, we have $(j, k) \notin A^{\min}_{i \to s}$, for all $k \in V$. Therefore $p^{is}_{jk} = 0$, for all $k \in V$, and hence $\sum_{k \in V} p^{is}_{jk} = 0 = r^{is}_j$, for all $j \in N - R_{i \to s}$.

Combining the results (i) and (ii), we have $\sum_{k \in V} p^{is}_{jk} = r^{is}_j$, for all $j \in N$.
Corollary 3.1: Let $P^{\min}_{i \to s}$ be the minimum energy path from a sensor node $i \in N$ to a sink node $s \in S$. Then, we have

$$l_{i \to s} = \sum_{j \in N} \sum_{k \in V} p^{is}_{jk}.$$

Proof: The result follows from Lemma 3.4 and Theorem 3.2:

$$l_{i \to s} = \sum_{j \in N} r^{is}_j = \sum_{j \in N} \sum_{k \in V} p^{is}_{jk}.$$

The result of this corollary can be used to calculate the length of each path in the minimum energy tree. Depending on the design criteria of the underlying routing algorithm, we may have different path matrices. However, the results here can be applied to any routing tree.

3.2.4. Branch Nodes

The energy dissipated by an intermediate node depends on the number of nodes that are connected to the sink through it. The packets of initiator nodes are forwarded towards sink nodes through intermediate nodes, so the nodes that are close to the sink nodes carry a higher load. When the battery of such a critical node runs out of energy, the whole branch that is connected through this node may become unreachable. Although there are techniques to reconstruct the tree to connect this branch to the sink [38], the routing tree should balance this relaying load throughout the network, in order to maximize the total lifetime of the network.

Definition 3.10: Let $s \in S$. The set of branch nodes or the branch set of a relay node $j \in N$ is defined as

$$B^s_j = \{ i \in N : j \in R_{i \to s} \}.$$

Similarly, the set of branch nodes or the branch set of a sink node $s \in S$ is defined as

$$B_s = \bigcup_{j \in N} B^s_j.$$

The branch set of a relay node includes all nodes in the routing tree that are connected to a sink node through a path passing over this relay node. In Figure 3.5, an arbitrary relay node j is given. This node is an element of the paths connecting the initiator nodes $\{i_0, i_1, \ldots, i_n\}$ to the sink node s.

Figure 3.5. The set of branch nodes of the relay node j

Because of the definition of the relay set, we assume that j also relays its own packets to the sink.
Therefore, j is an element of its branch set. If an initiator node reaches the relay node j through a multi-hop link, say through another intermediate node k, then from the definition, k is also an element of the branch set of j, because the node k is an initiator of a different path to the sink as well. The following lemma states the relation between the relay set and the branch set.

Lemma 3.5: Let $i, j \in N$, $s \in S$. Then, we have

$$i \in B^s_j \Leftrightarrow j \in R_{i \to s}.$$

Proof: The lemma is obvious from the definitions of these two sets. Assume $i \in B^s_j$. Then there exists a sink node $s \in S$, where $i \in N$ is the initiator node of the path $P_{i \to s}$, and $j \in R_{i \to s}$. Hence, $i \in B^s_j \Rightarrow j \in R_{i \to s}$. Assume now $j \in R_{i \to s}$. Then, there exists a path $P_{i \to s}$, with the initiator node $i \in N$ and sink node $s \in S$. Then, $i \in B^s_j$. Hence, $j \in R_{i \to s} \Rightarrow i \in B^s_j$.

Lemma 3.6: Let $j \in N$ be an arbitrary sensor node, and $s \in S$ a sink node in a sensor network. For the relay matrix $M_R = (r^{is}_j)$ of the tree $T^{\min}$, we have

$$|B^s_j| = \sum_{i \in N} r^{is}_j.$$

Proof: From the definition of $T^{\min}$, there exists a sink node $s_j \in S$ such that $P^{\min}_{j \to s_j} \subseteq T^{\min}$. For $i \in N$, the definition of the relay matrix states $r^{i s_j}_j = 1$ if $P^{\min}_{i \to s_j} \subseteq T^{\min}$, i.e., $i \in V^{\min}_{i \to s_j}$, and we have $r^{i s_j}_j = 0$ for all $i \in N - V^{\min}_{i \to s_j}$. Therefore,

$$|B^s_j| = \sum_{i \in N'} r^{is}_j + 0 = \sum_{i \in N'} r^{is}_j + \sum_{i \in N - N'} r^{is}_j = \sum_{i \in N} r^{is}_j, \quad \text{where } N' = V^{\min}_{i \to s_j}.$$

In order to calculate the total number of nodes that are connected to a sink node, it is sufficient to count all branch sets of the relay nodes that are immediate predecessors of this sink node, i.e., that are only one hop away from the sink node.

Lemma 3.7: Let $s \in S$ be an arbitrary sink node in a sensor network. Then,

$$|B_s| = \sum_{j \in N} \sum_{i \in N} \sum_{k \in V} p^{is}_{jk} \, p^{js}_{js}.$$

Proof: For $s \in S$, $|B_s| = \sum_{j \in N'} |B^s_j|$, where $N' = \{ j \in N : a_{js} = 1 \}$, i.e., N' is the set of sensor nodes which are directly connected to the sink node s. Therefore, $|B_s| = \sum_{j \in N} |B^s_j| \, a_{js}$.
For the minimum energy tree, we have $a_{js} = p^{js}_{js}$, from Definition 3.6. Then, using Lemma 3.6 and Theorem 3.2, we have

$$|B_s| = \sum_{j \in N} \sum_{i \in N} r^{is}_j \, p^{js}_{js} = \sum_{j \in N} \sum_{i \in N} \sum_{k \in V} p^{is}_{jk} \, p^{js}_{js}.$$

Definition 3.11: The branch size of a relay node $j \in N$ is defined as

$$b_j = \sum_{s \in S} |B^s_j|.$$

Corollary 3.2: Let $T^{\min}$ be the minimum energy tree in a sensor network, and $j \in N$ an arbitrary sensor node. Then, we have

$$b_j = \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} p^{is}_{jk}.$$

Proof: The result follows from Lemma 3.6 and Theorem 3.2:

$$b_j = \sum_{s \in S} |B^s_j| = \sum_{s \in S} \sum_{i \in N} r^{is}_j = \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} p^{is}_{jk}.$$

The routing algorithm should consider the number of nodes that are connected to a sink node, $b_j$, whenever a capacity constraint related to the sink node is stated. The routing tree should be constructed so that these limitations are not exceeded.

3.2.5. Energy Dissipation

One of the major design concerns in wireless sensor networks is the energy dissipation at the sensor nodes. Routing algorithms should manage the limited energy resources efficiently to increase the lifetime of individual nodes, thereby resulting in a network that is operational much longer. Here, we derive formulations for the total energy dissipation at the node level and for each data path in the network.

Definition 3.12: The total energy dissipation $e_{i \to s}$ for a data packet, which is generated at an initiator node $i \in N$ and arrives at a sink node $s \in S$, is defined by

$$e_{i \to s} = e(P^{\min}_{i \to s}).$$

Similarly, for the minimum energy tree $T^{\min} = (V', A')$, and $j \in N$, $k \in V$ where $(j, k) \in A'$, the relay energy load $e_j$ of a node j is defined by

$$e_j = e_{jk} \, b_j.$$

We assume that the data packets travel along the minimum energy paths in the network. If alternative paths can be used depending on the underlying routing algorithm, these definitions can be extended accordingly. Now, we calculate the total energy dissipation values.
Lemma 3.8: Let $i \in N$ be an arbitrary initiator node, and $s \in S$ a sink node in a sensor network. Then,

$$e_{i \to s} = \sum_{j \in N} \sum_{k \in V} e_{jk} \, p^{is}_{jk}.$$

Proof: Let $i \in N$, $s \in S$. Then we have $e_{i \to s} = e(P^{\min}_{i \to s}) = \sum_{(j,k) \in A^{\min}_{i \to s}} e_{jk}$ from the definition of the energy cost of the path. Let $M_A = (a_{jk})_{n \times v}$ be the adjacency matrix of $T^{\min}$. Then

$$\sum_{(j,k) \in A^{\min}_{i \to s}} e_{jk} = \sum_{(j,k) \in A^{\min}_{i \to s}} e_{jk} + 0 = \sum_{(j,k) \in A^{\min}_{i \to s}} e_{jk} a_{jk} + \sum_{(j,k) \in A'} e_{jk} a_{jk} = \sum_{(j,k) \in N \times V} e_{jk} a_{jk} = \sum_{j \in N} \sum_{k \in V} e_{jk} a_{jk},$$

where $A' = N \times V - A^{\min}_{i \to s}$. Finally, using Lemma 3.3 we have

$$e_{i \to s} = \sum_{j \in N} \sum_{k \in V} e_{jk} a_{jk} = \sum_{j \in N} \sum_{k \in V} e_{jk} \, p^{is}_{jk}.$$

The total energy dissipation values $e_{i \to s}$ give the energy cost of the communication path $P_{i \to s}$ in the sensor network. Energy-aware routing algorithms should choose better sink alternatives such that this cost is minimized. Next, we calculate the relay load of sensor nodes.

Lemma 3.9: Let $j \in N$ be an arbitrary relay node, and $s \in S$ a sink node in a sensor network. Then,

$$e_j = \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} e_{jk} \, p^{is}_{jk}.$$

Proof: Let $i \in N$, $k \in V$, $T^{\min} = (V', A')$, and $(j, k) \in A'$. Then we have

$$e_j = e_{jk} b_j = \sum_{k \in V} e_{jk} a_{jk} b_j = \sum_{k \in V} e_{jk} a_{jk} \sum_{s \in S} \sum_{i \in N} \sum_{v \in V} p^{is}_{jv}$$

from Corollary 3.2. Therefore, we have

$$e_j = \sum_{k \in V} \sum_{s \in S} \sum_{i \in N} \sum_{v \in V} e_{jk} a_{jk} \, p^{is}_{jv} = \sum_{k \in V} \sum_{s \in S} \sum_{i \in N} \sum_{v \in V} e_{jk} \, p^{is}_{jk} \, p^{is}_{jv}$$

from Lemma 3.3. Here, $p^{is}_{jk} p^{is}_{jv} = 1$ only when $k = v$, $k, v \in V$, and $p^{is}_{jk} p^{is}_{jv} = 0$ otherwise. Therefore, by changing the index, we have

$$e_j = \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} e_{jk} \, p^{is}_{jk}.$$

The relay energy load of sensor nodes should be controlled efficiently, so that the relay traffic is equally distributed among all relay nodes. An upper limit could be determined for this load, so that premature node failures could be prevented.

3.2.6. Counting the Packets

In order to manage the energy resources of sensor nodes efficiently, we must have a mechanism to quantify the number of packets that these sensor nodes deal with during a time frame.
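The sums of Lemma 3.8 and Lemma 3.9 can be evaluated directly once the path matrix entries are known. The sketch below is only an illustration: the node coordinates, the next-hop map, the values of κ, α and τ, and all function names are assumptions chosen for the example, and the path matrix is stored sparsely as a dictionary rather than as a four-dimensional array.

```python
import math

# Illustrative parameters for e_jk = kappa * d_jk**alpha + tau (Equation 3.1).
KAPPA, ALPHA, TAU = 1.0, 2.0, 0.5

# Hypothetical layout: sensors 0, 1, 2 and one sink (node 3).
pos = {0: (0, 0), 1: (1, 0), 2: (2, 1), 3: (2, 0)}
N, S = [0, 1, 2], [3]
V = N + S
next_hop = {0: 1, 1: 3, 2: 3}   # assumed routing tree


def e(j, k):
    """Energy cost of arc (j, k) according to Equation 3.1."""
    return KAPPA * math.dist(pos[j], pos[k]) ** ALPHA + TAU


# Sparse path matrix: p[(i, s, j, k)] = 1 if arc (j, k) is on the path i -> s.
p = {}
for i in N:
    arcs, j = [], i
    while j in next_hop:          # sinks have no next hop
        arcs.append((j, next_hop[j]))
        j = next_hop[j]
    for a, b in arcs:
        p[(i, j, a, b)] = 1       # j now holds the sink that was reached


def path_energy(i, s):
    """Lemma 3.8: sum over j in N, k in V of e_jk * p[i][s][j][k]."""
    return sum(e(j, k) * p.get((i, s, j, k), 0) for j in N for k in V if j != k)


def relay_load(j):
    """Lemma 3.9: sum over s, i, k of e_jk * p[i][s][j][k]."""
    return sum(e(j, k) * p.get((i, s, j, k), 0)
               for s in S for i in N for k in V if k != j)


print(path_energy(0, 3), relay_load(1), relay_load(2))
```

In this layout every link has unit length, so each arc costs 1.5. Node 1 forwards its own packets and those of node 0, and its relay load is therefore twice the cost of its outgoing arc, in agreement with $e_j = e_{jk} b_j$ for $b_1 = 2$.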
Definition 3.13: The number of packets going through a relay node $j \in N$ during a time interval (0, t] is denoted as $n^R_j(t)$. Similarly, the number of packets generated by an initiator node $i \in N$ during a time interval (0, t] is denoted as $n^G_i(t)$. Equivalently,

$$n^R_j(t) = \sum_{s \in S} \sum_{i \in B^s_j} n^G_i(t).$$

Lemma 3.10: Let $j \in N$ be an arbitrary sensor node in a sensor network. Then,

$$n^R_j(t) = \sum_{s \in S} \sum_{i \in N} n^G_i(t) \, r^{is}_j.$$

Proof: Let $i \in B^s_j$ be an arbitrary initiator node in the branch set of j, and $s \in S$ a sink node. Then from Lemma 3.5, we know $j \in R_{i \to s}$, and therefore $r^{is}_j = 1$. Then, we have

$$n^R_j(t) = \sum_{s \in S} \sum_{i \in B^s_j} n^G_i(t) \, r^{is}_j.$$

For $i \in N - B^s_j$, a similar argument gives $r^{is}_j = 0$. Hence,

$$n^R_j(t) = \sum_{s \in S} \sum_{i \in B^s_j} n^G_i(t) \, r^{is}_j + 0 = \sum_{s \in S} \sum_{i \in B^s_j} n^G_i(t) \, r^{is}_j + \sum_{s \in S} \sum_{i \in N - B^s_j} n^G_i(t) \, r^{is}_j = \sum_{s \in S} \sum_{i \in N} n^G_i(t) \, r^{is}_j.$$

Corollary 3.3: Let $j \in N$ be an arbitrary sensor node in a sensor network. Then, we have

$$n^R_j(t) = \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} n^G_i(t) \, p^{is}_{jk}.$$

Proof: The result follows from Lemma 3.10 and Theorem 3.2:

$$n^R_j(t) = \sum_{s \in S} \sum_{i \in N} n^G_i(t) \, r^{is}_j = \sum_{s \in S} \sum_{i \in N} n^G_i(t) \sum_{k \in V} p^{is}_{jk} = \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} n^G_i(t) \, p^{is}_{jk}.$$

In order to quantify the number of packets $n^G_i(t)$ generated during a time period, we have to clarify the packet generation process, which is a random process whose behavior is not predictable in all its details. This process may have an arbitrary probability distribution, which is possibly known a priori, as it is closely related to the underlying sensor application.

Definition 3.14: Let $i \in N$ be an arbitrary initiator node. Let $X_i(t)$ denote the number of packets generated during the time interval (0, t]. Then $\{X_i(t), t \ge 0\}$ is a family of random variables, forming a stochastic process.
Let $Z_i^{(n)}$ denote the nth interarrival time, $n \ge 1$, $Z_i^{(n)} = t_n - t_{n-1}$, with a known cumulative distribution function and a known expected value

$$P(Z_i^{(n)} \le x) = F_i(x), \qquad E[Z_i^{(n)}] = \mu_i.$$

Figure 3.6. The packet generation interarrival times $Z_i^{(n)}$ for the initiator node i

Recall 3.1: For $t \to \infty$, we have

$$E[X_i(t)] = \frac{t}{\mu_i}.$$

Proof: For the proof, the reader may refer to [39].

Corollary 3.4: Let $i \in N$ be an arbitrary sensor node, having an average packet interarrival time $\mu_i$. For $t \to \infty$,

$$n^G_i(t) = \frac{t}{\mu_i}.$$

As an example, assume a Poisson packet generation process with parameter $\lambda_i$, where the interarrival time has the cumulative distribution function $F_i(x) = 1 - e^{-\lambda_i x}$. Then, we know that $\mu_i = 1 / \lambda_i$, hence $n^G_i(t) = \lambda_i t$.

We further assume that the packet generation processes of the individual sensor nodes are independent of each other. For general-purpose continuous monitoring applications, this assumption clearly holds: sensors send their measurements at independent moments in time. For some applications, like seismic measurements, this assumption might not hold. That is, when a special event occurs in the field, every node that is close to the event will try to send a packet to inform the sink node. Most of these applications, however, use a data aggregation mechanism where the data packets that are generated separately are joined into a single packet. Then, only this packet is forwarded to the sink. Considering only those aggregated packets as real packet generations for the sensor network, the assumption holds for these types of sensor networks too.

The number of packets that a relay node has to forward towards a sink node can be found as follows.

Lemma 3.11: Let $j \in N$ be an arbitrary node in a sensor network where the packet generation of the sensor nodes is independent. Then, for $t \to \infty$,

$$n^R_j(t) = \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} (t / \mu_i) \, p^{is}_{jk}.$$
Proof: Let $\{X^R_j(t), t \ge 0\}$ be the random process counting the number of packets going through a relay node $j \in N$ during a time interval (0, t], where $X^R_j(t) = \sum_{s \in S} \sum_{i \in B^s_j} X_i(t)$. Then we have

$$E[X^R_j(t)] = E\Big[ \sum_{s \in S} \sum_{i \in B^s_j} X_i(t) \Big] = \sum_{s \in S} \sum_{i \in B^s_j} E[X_i(t)],$$

since $X_i(t)$, $i \in B^s_j$, are independent. Therefore, for $t \to \infty$, we have

$$n^R_j(t) = \lim_{t \to \infty} E[X^R_j(t)] = \sum_{s \in S} \sum_{i \in B^s_j} (t / \mu_i).$$

Using the same technique as in Corollary 3.3, we have

$$n^R_j(t) = \sum_{s \in S} \sum_{i \in B^s_j} (t / \mu_i) = \sum_{s \in S} \sum_{i \in N} (t / \mu_i) \, r^{is}_j = \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} (t / \mu_i) \, p^{is}_{jk}.$$

Corollary 3.5: Let $j \in N$ be an arbitrary node in a sensor network. When all initiator nodes have the same average packet interarrival time $\mu$, that is $\mu = \mu_i$ for all $i \in N$, then

$$n^R_j(t) = \frac{t}{\mu} \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} p^{is}_{jk}.$$

Proof: The result follows directly from Lemma 3.11.

3.2.7. Node Lifetime

The lifetime of the sensor network depends closely on the lifetime of each individual sensor in the network. Whenever a node failure occurs, all of its branch nodes become unreachable until a new route discovery process is initiated. Therefore, we have to control the lifetime of each sensor node and try to prolong it as much as possible, in order to maximize the network lifetime.

Definition 3.15: Let $E_j(t)$ be the residual energy of a node $j \in N$ at a given time t. Then $E_j(0)$ denotes the initial battery capacity of the node $j \in N$. The node $j \in N$ is said to be alive whenever $E_j(t) > 0$. Similarly, the node $j \in N$ is said to be exhausted or dead whenever $E_j(t) = 0$. Let $e_j(t)$ denote the total energy dissipation of a node $j \in N$ during a time interval (0, t]. Then we have

$$E_j(t) = E_j(0) - e_j(t).$$

Clearly, the residual energy $E_j(t)$ of a sensor node is a monotonically decreasing, real-valued function of time. We have to find the approximate time when an operational node becomes exhausted.
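The long-run packet count of Lemma 3.11 can be sketched numerically as follows. Instead of summing over the path matrix, this example walks each initiator's path and adds $t / \mu_i$ whenever the path passes through node j, which is equivalent for a routing tree in which every initiator has exactly one path. The function name, the next-hop map and the interarrival times are assumptions made for illustration.

```python
def expected_relay_packets(j, t, mu, next_hop):
    """Long-run number of packets handled by node j during (0, t].

    mu[i] is the mean packet interarrival time of sensor i and next_hop[i]
    the node it forwards to; sinks do not appear as keys of next_hop.
    Both structures are hypothetical inputs for this sketch.
    """
    total = 0.0
    for i in mu:                      # every sensor is an initiator
        node = i
        while node in next_hop:       # follow the path of i towards its sink
            if node == j:             # j is in the relay set of this path
                total += t / mu[i]
                break
            node = next_hop[node]
    return total


# Hypothetical tree: 0 -> 1 -> sink 3, and 2 -> sink 3.
next_hop = {0: 1, 1: 3, 2: 3}
mu = {0: 2.0, 1: 4.0, 2: 5.0}
print(expected_relay_packets(1, 100.0, mu, next_hop))
```

Node 1 handles its own traffic (100 / 4 packets) and that of node 0 (100 / 2 packets), so the call returns 75 packets for the interval of length 100.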
Lemma 3.12: Let $j \in N$ be an arbitrary node in a sensor network where the packet generation of the sensor nodes is independent. Then, for $t \to \infty$,

$$e_j(t) = \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} (t / \mu_i) \, p^{is}_{jk} \, e_{jk}.$$

Proof: Obvious from Lemma 3.9 and Lemma 3.11.

Corollary 3.6: Let $j \in N$ be an arbitrary node in a sensor network. Node j becomes exhausted when

$$t = \frac{E_j(0)}{\sum_{s \in S} \sum_{i \in N} \sum_{k \in V} (1 / \mu_i) \, p^{is}_{jk} \, e_{jk}}.$$

Proof: The derivation is obvious from Lemma 3.12. We want to find t such that $E_j(t) = 0$. That is,

$$E_j(t) = E_j(0) - e_j(t) = E_j(0) - \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} (t / \mu_i) \, p^{is}_{jk} \, e_{jk} = 0.$$

Hence, $E_j(0) = t \cdot \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} (1 / \mu_i) \, p^{is}_{jk} \, e_{jk}$, and the result follows.

If we use identical sensor nodes throughout the sensor field, then we can assume that their initial battery capacities are equal. Therefore, we can rewrite the objective of maximizing the lifetime of sensor nodes as a simpler minimization problem, where we try to minimize the denominator, i.e.,

$$\sum_{s \in S} \sum_{i \in N} \sum_{k \in V} (1 / \mu_i) \, p^{is}_{jk} \, e_{jk}.$$

3.2.8. Investment Cost

The most important decision point for an investor is the total investment cost, i.e., the total budget of the system. Therefore, we have to define parameters to calculate the total expenditure needed to build an operational sensor network with multiple sink nodes.

During the lifetime of the network, redeployment of additional sensor nodes might be necessary. This requirement might arise because of node failures due to exhausted batteries, as well as because of changed monitoring needs where more precise measurements become necessary. The number of deployed sensor nodes is the most basic budget entry. Here, we extend Definition 3.1, where we defined the set of sensor nodes.

Definition 3.16: Let $r \in \mathbb{Z}^*$ represent the number of redeployments in a sensor network, where $\mathbb{Z}^* = \{0, 1, 2, \ldots\}$. Let $N_r$ denote the set of sensor nodes in the r-th redeployment.
Let $n_r = |N_r|$ denote the number of sensor nodes in the r-th redeployment. Then

$$N = \bigcup_{i=0}^{r} N_i, \qquad n = \sum_{i=0}^{r} n_i.$$

From the definition, it is clear that $r = 0$ represents the initial deployment of the sensor nodes, and $r \ge 1$ represents additional redeployments to the network.

Definition 3.17: The cost of a sensor node at the r-th redeployment is given as $c^N_r \in \mathbb{R}$. The cost of the deployment action of the r-th redeployment is given as $c^D_r \in \mathbb{R}$. The cost of a sink node $s \in S$ is given as $c^S_s \in \mathbb{R}$. The cost of placement of a sink node $s \in S$ is a real-valued function $c^P_s : \mathbb{R}^2 \to \mathbb{R}$, having the coordinates $(x_s, y_s)$ of the sink node s as arguments.

These unit cost parameters will be used to calculate the total budget of the sensor network operation. The cost of individual sensor nodes could vary between redeployment attempts because the number of deployed sensors will vary. Considering economies of scale, the more sensor nodes are deployed, the lower the unit price. At each deployment phase, a constant cost is incurred depending on the labor and the vehicle used. If the sensors are scattered over a large region from an aircraft, then a much higher deployment action cost has to be paid, compared to a deployment where only a smaller region and manual placement are considered. For the sink nodes, as the sink nodes might have different capabilities, their costs might be different. If the investor chooses identical sink nodes, then $c^S_s = c^S$, for all $s \in S$. Finally, the placement of the sink nodes might depend on environmental restrictions, where the cost is related to the geographical coordinates of the location. Then the regional placement cost function $c^P_s(x_s, y_s)$ of the sink $s \in S$ will be different for such locations. Now, we can define the total cost for the sensor network investment.

Definition 3.18: The total cost of sensor nodes at the r-th redeployment is given as $C^N_r \in \mathbb{R}$, where $C^N_r = n_r c^N_r$.
Then, the total cost of sensor nodes is given by $C^N \in \mathbb{R}$, where

$$C^N = \sum_{i=0}^{r} C^N_i = \sum_{i=0}^{r} n_i c^N_i.$$

The total cost of deployment action is given by $C^D \in \mathbb{R}$, where

$$C^D = \sum_{i=0}^{r} c^D_i.$$

The total cost of sink nodes is given by $C^S \in \mathbb{R}$, where

$$C^S = \sum_{s \in S} c^S_s.$$

The total cost of sink node placements is given by $C^P \in \mathbb{R}$, where

$$C^P = \sum_{s \in S} c^P_s(x_s, y_s).$$

The total investment cost is given by $C \in \mathbb{R}$, where

$$C = \sum_{i=0}^{r} n_i c^N_i + \sum_{i=0}^{r} c^D_i + \sum_{s \in S} c^S_s + \sum_{s \in S} c^P_s(x_s, y_s).$$

The total investment cost is the sum of all deployment costs of a sensor network operation. For an economically feasible operation, the investor needs to minimize this cost as much as possible, subject to the other constraints in the network.

3.3. Summary

In this chapter, we have derived several new formulations that are necessary to build a framework for the multiple sink sensor network design problem. Important characteristics of the sensor network infrastructure have been analyzed and the necessary definitions have been introduced. These definitions and formulations will be used in the following chapters to quantify the energy dissipation at the sensor nodes. Thereafter, design objectives related to the multiple sink sensor network design problem will be presented based on these formulations.

4. QUANTIFYING SAVED ENERGY BY MULTI-HOPPING

Sensor nodes have a short transmission range due to their limited radio capabilities. Therefore, the data must be relayed towards the sink using intermediate nodes. In addition, it may be more advantageous to use a multi-hop path to the sink node consisting of shorter links rather than using a single long connection. The energy consumption at the transmitter is known to be proportional to $d^{\alpha}$, where d is the range of the radio signals and $\alpha$ is the path loss exponent [7, 40-43]. In [40], a minimum energy connection protocol based on the distributed Bellman-Ford algorithm is investigated. The effect of mobilization is also analyzed.
In [41], a power-aware routing algorithm for wireless ad-hoc networks is presented, which helps to minimize the transmission power needed to forward data packets. In [42], directional antennae are used to construct the minimum energy tree. Here again, the cost of a link is assumed to consist of only the dominant component, i.e., the transmitter energy. Energy efficiency in constructing multicast trees on wireless networks is considered in [44], where the energy gain is focused on transmitter energy.

There are also different studies on energy-based optimizations. In [45], the optimum one-hop transmission distance that minimizes the total system energy is calculated. In that work, it is assumed that each node communicates with its next-hop node in a linear network topology. In [46], a communication protocol for wireless sensor networks is proposed, based on energy efficiency. Here, only the free space propagation model is assumed and the effects of different path loss exponent values are not investigated. A different minimum energy routing model is proposed in [47], where the effects of shadowing and fading are also considered. Although the importance of the receiver energy is not disputed, this factor is neglected in the detailed analysis.

When only the transmission energy is considered, using shorter multi-hop links seems to be more advantageous. However, due to other energy-consuming activities on the sensor nodes, such as reception of relayed messages, sensing and computation tasks, a considerable overhead energy might be dissipated while forwarding a message. Therefore, multi-hopping is not always advantageous in wireless sensor networks.

In this section, we investigate when the usage of an intermediate node results in energy gain. We analyze the amount of energy gained by using multi-hop links to construct a communication path. We focus on uniformly deployed sensor nodes, each having identical communication capabilities.
The sensor nodes are assumed to be able to adjust their transmission power. Therefore, each sensor consumes only the amount of energy that suffices for the transmitted radio waves to reach the intended receiver antenna. A similar transmitter model is proposed in [48].

4.1. Network Model

4.1.1. Assumptions

The sensor network consists of sensor nodes and one or more sink nodes where the results of sensor measurements are collected. If the sensors are equipped with omnidirectional antennae, then each node is connected to every other node within the transmission range of its radio signals. This situation is presented in Figure 4.1. Sensor node $i_0$ is able to transmit radio signals at three different power levels $P_1$, $P_2$, and $P_3$. Depending on the power level, sensor nodes at varying distances from the originator node start to receive signals.

Figure 4.1. Radio transmission with different power levels results in different transmission ranges

The sensors are assumed to be identical, having the same radio equipment. Therefore, ignoring environmental obstacles, whenever a node i can reach another node j, backward communication is also possible, i.e., node i can be reached by node j. Routing decisions will, however, assign different transmission power levels to sensor nodes in order to save energy. Therefore, it may easily happen that node i transmits with a high power level to reach a distant node j, while node j transmits with a lower power level to a closer node k. In this case, it is clear that node j cannot be heard by node i. Therefore, we assume directed edges in the network graph G.

4.1.2. Multi-Hop Links

During selection of the most energy-efficient route, alternative links must be considered.
In the simplest case, one has to choose between a direct link from source to destination and a multi-hop link using intermediate nodes, if available. Figure 4.2 shows such a subproblem during the routing decision. A communication request between nodes i and j may trivially result in a direct link (i, j) between those two nodes, whereas a "good" alternative could be found by using the intermediate node k, resulting in the path <i, k, j>.

Figure 4.2. Using multi-hop links in routing decisions

4.2. Energy Saving

Routing algorithms in sensor networks should prefer communication links with less energy consumption among the alternatives. Suppose that we have two sensor nodes i and j within the sensor field, where node i wants to send a data packet to node j. This situation is represented in Figure 4.3 (a). Trivially, node i should adjust its transmitter circuitry power so that node j will receive the transmitted signals. Alternatively, the routing algorithm may decide to use an intermediate node k, which lies between the transmitter and the receiver nodes. Energy saving, $\delta_E$, can be formulated as the difference of total energy consumption between the two alternatives

$$\delta_E = e_{Total(1)} - e_{Total(2)} \qquad (4.1)$$

where $e_{Total(1)}$ and $e_{Total(2)}$ give the total energy consumption values of these two alternatives, respectively. Here, we will consider three different scenarios where an intermediate node can be used, and compare the energy saving achieved in each scenario.

Figure 4.3. Routing decision alternatives, (a) direct communication, (b) and (c) using an intermediate node

4.2.1. 1-D Communication Links

In the simplest case, we assume a one-dimensional environment. Here, the intermediate node k lies on the line connecting the source and the destination nodes, as given in Figure 4.3 (b). It is clear that energy loss would occur if node k were beyond node i or node j.
Therefore, we consider $0 \le d_{ik}, d_{kj} \le d_{ij}$. Using Equation 2.6, we have

$$e_{Total(1)} = e_{ij} = \kappa d_{ij}^{\alpha} + \tau$$
$$e_{Total(2)} = e_{ik} + e_{kj} = (\kappa d_{ik}^{\alpha} + \tau) + (\kappa d_{kj}^{\alpha} + \tau) \qquad (4.2)$$

where $e_{Total(1)}$ gives the total energy consumption when a direct communication link between nodes i and j is established, and $e_{Total(2)}$ gives the total energy consumption when an intermediate node k is used. In the latter case, a two-hop communication path is utilized: the first link connects node i with node k, and the second link connects node k with node j. By using Equation 4.2, the energy saving can be found as follows.

$$\delta_E = \kappa \left[ d_{ij}^{\alpha} - d_{ik}^{\alpha} - d_{kj}^{\alpha} \right] - \tau \qquad (4.3)$$

Figure 4.4. Energy saving in 1-D communication scenario (horizontal axis: intermediate node position $d_{ik}$; vertical axis: energy saving $\delta_E$)

Here, using the fact that $d_{ij} = d_{ik} + d_{kj}$, we get

$$\delta_E = \kappa \left[ \left( d_{ij}^{\alpha} - d_{ik}^{\alpha} \right) - \left( d_{ij} - d_{ik} \right)^{\alpha} \right] - \tau \qquad (4.4)$$

We keep the distance between node i and node j constant and observe the energy saving behavior. An intermediate node k is used that lies on the line between node i and node j. For simplicity, we take $\tau = \tau_0$, an arbitrary fixed energy requirement at each sensor node. The behavior can be observed in Figure 4.4. When node k is close to the source or the receiver, a significant amount of energy loss occurs. Using an intermediate node becomes meaningful only when this node is distant from both the sender and the receiver. For different values of the path loss exponent $\alpha$, this behavior remains the same. However, the amount of energy that is required for successful data transmission increases exponentially, as we can see in Figure 4.5.

Figure 4.5. Effect of $\alpha$ on energy saving in 1-D communication scenario (horizontal axis: intermediate node position $d_{ik}$; vertical axis: energy saving $\delta_E$, logarithmic scale; curves for $\alpha = 3, 4, 5, 6$)

The point of maximum energy saving can be found by setting $\delta_E'(d_{ik}) = 0$.
The first derivative of energy saving with respect to the distance between node i and node k can be written as follows.

$$\frac{d\delta_E}{dd_{ik}} = \alpha\kappa \left[ (d_{ij} - d_{ik})^{\alpha - 1} - d_{ik}^{\alpha - 1} \right] \qquad (4.5)$$

Here, we have $\delta_E'(d_{ik}) = 0$ if $d_{ik} = d_{ij}/2$. In other words, maximum energy saving would be achieved when node k is exactly on the midpoint between node i and node j.

Using this result, we can find the places for an intermediate node where energy is saved when this node is used as a relay node. In other words, we want to find $d_{ik}$ so that $\delta_E > 0$. Setting $d_{ij} = 2d_{ik}$ in Equation 4.4, we get

$$\delta_E = 2\left(2^{\alpha - 1} - 1\right) \kappa d_{ik}^{\alpha} - \tau \qquad (4.6)$$

Therefore, we can say that $\delta_E > 0$ whenever we have an intermediate node whose distance from the source node satisfies the following.

$$d_{ik} > \left[ \frac{\tau}{2\left(2^{\alpha - 1} - 1\right)\kappa} \right]^{1/\alpha} \qquad (4.7)$$

Equation 4.6 provides another important result. We know from Equation 2.6 that $e_T(d_{ik}) = \kappa d_{ik}^{\alpha}$. Therefore, energy is saved whenever the following inequality between the overhead energy τ and the transmitter energy $e_T$ holds.

$$\tau < 2\left(2^{\alpha - 1} - 1\right) e_T \qquad (4.8)$$

4.2.2. Isosceles Triangular Communication Links

In the second scenario, we let the intermediate node lie on the top corner of an isosceles triangle whose other two corners are the source and the destination nodes. This scenario is presented in Figure 4.3 (c). Obviously, the distance between the intermediate node and either the source or the destination cannot be larger than the distance of a direct link between the source and the destination. Therefore, in this case, we consider $d_{ij}/2 \le d_{ik}, d_{kj} \le d_{ij}$.

Since the routing triangle is isosceles, we know $d_{ik} = d_{kj}$. Therefore, the energy saving defined in Equation 4.3 can be represented as follows.

$$\delta_E = \kappa \left( d_{ij}^{\alpha} - 2 d_{ik}^{\alpha} \right) - \tau \qquad (4.9)$$

Figure 4.6. Energy saving in isosceles triangular communication scenario

The energy saving with respect to increasing $d_{ik}$ can be seen in Figure 4.6.
It is monotonically decreasing because the total distance of data links is increasing and more transmitter energy would be necessary to communicate. Maximum energy saving is achieved when the intermediate node k lies on the line connecting node i and node j. In Figure 4.7, we observe similar behavior for different values of the path loss exponent α, although the amount of energy increases exponentially.

In order to find the places where the amount of energy saving is positive, we put $\delta_E > 0$ in Equation 4.9 and derive the following inequality.

$$d_{ik} < \left[ \frac{1}{2} \left( d_{ij}^{\alpha} - \frac{\tau}{\kappa} \right) \right]^{1/\alpha} \qquad (4.10)$$

Figure 4.7. Effect of α on energy saving in isosceles triangular communication scenario

4.2.3. Arbitrary Triangular Communication Links

In real life situations, however, arbitrary triangular routing alternatives will be found, and the routing algorithm has to decide whether to choose the direct link or the multi-hop one. Figure 4.8 shows this scenario.

Figure 4.8. Arbitrary triangular communication scenario

Here, we assume that the nodes are lying on a 2-D plane where node i is at the origin, and node j lies on the x-axis with coordinates $(d_{ij}, 0)$. Then, the intermediate node k has coordinates (x, h), where h is the height of the triangle. In this case, energy saving can be found as follows, using $d_{ik}^2 = h^2 + x^2$ and $d_{kj}^2 = h^2 + (d_{ij} - x)^2$.

$$\delta_E = \kappa \left[ d_{ij}^{\alpha} - \left(h^2 + x^2\right)^{\alpha/2} - \left(h^2 + (d_{ij} - x)^2\right)^{\alpha/2} \right] - \tau \qquad (4.11)$$

This equation is plotted in Figure 4.9. We have seen this behavior in the first two scenarios. The generalization can easily be reduced to these scenarios by putting $h = 0$ or $x = d_{ij}/2$ for the first and second scenarios, respectively.

Figure 4.9. Energy saving in arbitrary triangular communication scenario

4.2.4. Generalization

Until now, we have presented techniques for two-hop scenarios.
However, these techniques can easily be applied recursively to situations where a multi-hop communication link should be considered as an alternative.

Considering the situation in Figure 4.10 (a), when node i wants to reach node j, there might be more than one intermediate node, such as nodes k and l. In this case, the underlying routing algorithm should consider the amount of energy saving when nodes k and l are used as relay nodes.

Figure 4.10. Generalization into a multi-hop path

When a distributed routing algorithm is used, node i will decide on its output power level according to its neighbors. Therefore, node i will compare the direct link (i, j) with the path <i, k, j>, as in Figure 4.10 (b). Node i is not responsible for the routing decisions of node k. Therefore, node k should decide whether to send packets through node l or to send them directly to node j.

4.3. Simulations on the Energy Savings by Multi-hopping

In order to validate the effect of utilizing multi-hop communication links on energy saving, we performed simulations using Opnet Modeler [49] on different scenarios.

4.3.1. Simulation Setup

In our simulations, we focus on three different types of sensor nodes that differ in their transmission power adjustment capability. The first node type, Pmax, is unable to perform any power control on its transmitter circuitry. Nodes of this type always send packets with the maximum transmission power, independent of the distance between the source and destination nodes. The second node type, P3, can adjust its transmission power to three different power levels, whereas the third node type, Pcont, has a continuous power level adjustment capability. In simulations, however, we have used 20 discrete power levels instead of a continuous scale. P3 and Pcont type sensors try to reduce their transmission power to the minimum level that is sufficient for their radio packets to reach their destination.
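The power selection rule of the three node types can be illustrated with a minimal sketch. It is not the simulation code used in this work. It assumes that the discrete power levels are evenly spaced between the minimum and maximum transmission powers of Table 4.1, and that the required power scales as $d^{\alpha}$ with a constant calibrated from the sample values (800 mW for 200 m at α = 2). Both the spacing and the scaling rule, as well as all function names, are illustrative assumptions.

```python
def power_levels(n_levels, p_min=100.0, p_max=2000.0):
    """Evenly spaced power levels in mW (assumed spacing).

    n_levels = 1 models Pmax, 3 models P3, 20 models Pcont.
    """
    if n_levels == 1:
        return [p_max]
    step = (p_max - p_min) / (n_levels - 1)
    return [p_min + i * step for i in range(n_levels)]


def required_power(d, alpha, p_ref=800.0, d_ref=200.0, alpha_ref=2.0):
    """Power in mW needed to reach distance d (assumed d**alpha scaling).

    The constant k is calibrated so that p_ref reaches d_ref in open air.
    """
    k = p_ref / d_ref ** alpha_ref
    return k * d ** alpha


def select_power(d, alpha, levels):
    """Return the lowest available level that reaches d, or None."""
    need = required_power(d, alpha)
    for p in sorted(levels):
        if p >= need:
            return p
    return None  # destination is out of range even at maximum power


if __name__ == "__main__":
    for name, n in (("Pmax", 1), ("P3", 3), ("Pcont", 20)):
        print(name, select_power(30.0, 3.0, power_levels(n)))
```

Under these assumptions, a 30 m link at α = 3 needs about 540 mW, so the sketch selects 2,000 mW for Pmax, 1,050 mW for P3, and 600 mW for Pcont, which shows why a finer power scale wastes less transmitter energy.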
For each experiment, 10 different random sensor networks are generated. The graphs are plotted using the average values derived from these networks, with a 95 per cent confidence interval. Each sensor network consists of one sink node and 100 sensor nodes. The sink node is located in the middle of the area, whereas the sensor nodes are distributed uniformly. We have also considered locating the sink node at one of the corners of the area, which did not change the overall behavior of the system.

Table 4.1. Simulation parameters

Parameter                            Value
Sample transmission power            800 mW
Sample transmission range            200 m
Data rate                            20 kbps
Packet size                          1024 bits
Minimum transmission power           100 mW
Maximum transmission power           2,000 mW
Default area size (A)                200 m x 200 m
Default path loss exponent (α)       3
Number of sensor nodes               100

The sensors are assumed to use 800 mW transmission power for a 200 m radio range in open air (α = 2). These values are chosen, as they are very close to the Berkeley/Crossbow Mica Motes’ specifications [8]. These data are used to calculate the corresponding radio range for each environment type with a different path loss exponent value. These assumptions are summarized in Table 4.1.

The energy model in Equation 2.6 is used to calculate the average energy spent at each sensor node for one packet transmission. Here, the overhead energy τ has a typical value of 20 mJ per packet, where 400 mW receiver power is assumed, and both sensing and computation energy are neglected. This value follows from Table 4.1: receiving a 1024-bit packet at 20 kbps takes 51.2 ms, which at 400 mW amounts to approximately 20 mJ. However, we have considered τ = 0 mJ to 50 mJ to examine the effect of different overhead energy levels.

In this work, we monitor the average hop count and the average energy spent per packet at each node. These values are calculated as follows. After the network setup phase, a communication tree is formed. Thereafter, for each sensor, the communication path from itself to the sink node is traversed, and both the number of hops and the necessary energy are recorded.

4.3.2. Results

In the first experiment, the default simulation parameters are used. The results are presented in Figure 4.11 and Figure 4.12.

Figure 4.11. Average hop count versus overhead energy τ (A = 200 m x 200 m, α = 3)

Multi-hop communication paths are utilized whenever the overhead at each hop is small. Therefore, at higher overhead energy values, direct links are preferred to multi-hop paths. When the sensors are communicating with the maximum transmission power, the resulting routing tree will be independent of the overhead energy, i.e., each sensor will try to communicate with the one that is furthest away from itself. Hence, we have a constant average hop count for Pmax nodes. Since Pcont nodes can make a finer power adjustment than P3 nodes, this optimization results in a higher average hop count.

Figure 4.12. Average node energy versus overhead energy τ (A = 200 m x 200 m, α = 3)

It is evident that the average node energy should increase when the overhead energy increases (see Figure 4.12), since the overhead is a constant additive term of the total energy per node. The increase in the total energy, however, is more than the additional overhead. The reason for this is that the tendency to use direct links increases as the overhead energy increases, and direct links require more energy than multi-hop paths consisting of shorter links.

As the number of power adjustment levels increases, the energy spent at each node decreases. In other words, sensor nodes can use their energy more effectively. As an example, for the typical case where τ = 20 mJ, Pmax nodes spend on average $e_{Total}$ = 282 mJ, whereas Pcont nodes spend on average $e_{Total}$ = 137 mJ.
This corresponds to an energy saving of more than 50 per cent, which doubles the lifetime of each sensor node.

In Figure 4.13, we consider only the Pcont nodes. Here, the results of all experiments with different overhead energy values are plotted. The trendline indicates clearly that whenever the network is able to use multi-hop links, the average node energy decreases. The usage of multi-hop links, however, is determined by the amount of the overhead energy, as we have seen in Figure 4.11.

Figure 4.13. Average node energy versus average hop count (A = 200 m x 200 m, α = 3, only Pcont nodes are used)

Figure 4.14. Average node energy versus overhead energy τ (A = 400 m x 400 m, α = 3)

In the second experiment, the effect of sensor density is analyzed. For this purpose, the area size is increased to 400 m x 400 m while the number of sensors is kept the same. As shown in Figure 4.14, the network shows the same behavior as in the dense scenario, with the difference that the average node energy requirement becomes larger. This is because the average distance between the sensor nodes has increased. For our typical case where τ = 20 mJ, the improvement achieved by using Pcont nodes instead of Pmax nodes is found to be 42 per cent, which again extends the lifetime of each sensor node considerably (by a factor of about 1.7, if the lifetime is taken to be inversely proportional to the energy spent per packet).

The third experiment focuses on different environmental conditions by varying the path loss exponent α. In this experiment, only Pcont nodes are used, since they were shown in the previous experiments to provide the most efficient energy management scheme.

Figure 4.15. Average hop count versus overhead energy τ (A = 200 m x 200 m, only Pcont nodes are used)

We know that in urban areas or in more obstructed environments, the value of α increases. Therefore, the radio transmission range decreases for the same transmission power values. As a result, the sensor nodes can be connected to the sink node only with shorter links, and therefore by using more hops (see Figure 4.15). Moreover, we can clearly observe that the degree of multi-hopping reduces when the overhead energy at each sensor node increases. This shows that the nodes then prefer direct links to multi-hop paths. For α values greater than 4, even the maximum transmission power that our sensor nodes are capable of becomes insufficient to form a connected network.

In rural areas (α = 2), however, the sensors can be more densely deployed, as the radio range is higher. In our experiment, each sensor node starts to communicate via a direct link with the sink node, as the overhead for using a multi-hop link is relatively high. Only for the case where the overhead is omitted (τ = 0 mJ) are some multi-hop links established.

In Figure 4.16, we observe that the energy dissipation of each sensor node is exponentially related to the path loss exponent. Therefore, in more obstructed environments, one must expect a shorter sensor lifetime, which is exponentially related to α.

Figure 4.16. Average node energy versus overhead energy τ (A = 200 m x 200 m, only Pcont nodes are used)

An interesting result is that the average hop count is also exponentially related to the path loss exponent. The typical case with τ = 20 mJ is shown in Figure 4.17. Here, we have a larger confidence interval for larger α values, since the interconnection degree of the network decreases, which results in more deviated values.
However, the exponential trend can easily be seen, since we use a logarithmic scale. Therefore, the degree of multi-hopping increases exponentially with increasing path loss exponent, which increases the end-to-end delay and packet loss rate, but decreases the total energy dissipation.

Figure 4.17. Average hop count versus path loss exponent α (A = 200 m x 200 m, τ = 20 mJ, only Pcont nodes are used)

4.4. Conclusions on the Energy Savings by Multi-hopping

In order to maximize the network lifetime, the energy resources of each individual sensor node must be consumed effectively. Using multi-hop paths that consist of shorter links instead of one long link might result in considerable energy gain.

In this chapter, we proposed a new analytical approach to quantify energy saving using multi-hopping and power level adjustments. We have studied different multi-hop communication scenarios and calculated the energy saving in each scenario. We have also extended these scenarios to general cases. The generalization can be applied to any arbitrary triangle and can be used in energy optimized route calculations. We also tried to quantify the effect of the path loss exponent α and the overhead energy τ on energy saving. These analytical methods can be used for developing faster power aware routing algorithms.

We have also validated our analytical study using simulations. Although the transmitter energy is reduced by using multi-hop communication links, we have shown that the total communication energy might increase depending on the overhead energy that has to be dissipated at every hop in the network. Therefore, the degree of hopping should decrease whenever higher overhead energy values are under consideration. We have examined the effect of overhead energy on the average hop count and on the average node energy per packet in different scenarios.
It is shown that the sensor lifetime can easily be doubled using power adjustable transmitter circuitry.

5. THE EFFECT OF OVERHEAD ENERGY TO THE NETWORK LIFETIME

Although the transmitter energy is one of the major factors of total energy dissipation in a sensor node, neglecting the overhead energy in energy aware routing decisions could result in suboptimal energy usage. Routing algorithms should take into account the overhead energy, which is spent at each hop of data transfer.

When only the transmission energy is considered as the communication cost, using shorter multi-hop links seems to be more advantageous. However, due to other energy consuming activities on the sensor nodes, such as reception of relayed messages, sensing and computation tasks, a considerable overhead energy might be dissipated while forwarding a message. Therefore, multi-hopping is not always advantageous in wireless sensor networks.

In this chapter, the use of multi-hop communication links is investigated, and the amount of energy gain that is acquired by correct routing energy calculations is compared. We show that neglecting the overhead energy and overemphasizing the importance of power adjustable transmitter circuitry could result in considerable energy loss.

5.1. Motivation for Overhead Energy Considerations

The path loss exponent α has a great impact on energy dissipation at the sensor nodes, since the transmitter energy is proportional to $d^{\alpha}$, where d is the range of the radio signals. On the other hand, the route calculations should also consider the overhead energy dissipation at the sensor nodes, which includes the receiver energy, the computation energy, and the sensing energy. These overhead energy requirements and path loss exponent values may result in different minimum energy tree structures, and consequently in different routing topologies.

Consider a small wireless sensor network with three sensor nodes i1, i2, i3 and one sink node s whose layout is given in Figure 5.1.
Even in such a small network, we can see that routing decisions based on energy calculations may result in different routes depending on the assumptions about the underlying model. Figure 5.1 (a) and (c) show the minimum energy routing trees where the overhead energy τ is neglected during routing calculations, assuming τ = 0 mJ, for different environmental situations with α = 2 and α = 3, respectively.

In real world sensor nodes, however, we must not forget the overhead energy, which is dissipated at each hop of data transfer. Assuming a realistic overhead energy value with τ = 20 mJ, different routing topologies would be found, which are presented in Figure 5.1 (b) and (d). These alternatives show that the actual minimum energy routes are different from the initial ones. The most important point is that neglecting the significance of the overhead energy dissipation would result in a considerable amount of energy waste.

Figure 5.1. A sample network representing different topology alternatives for different path loss exponent α and overhead energy τ values: (a) α = 2, τ = 0 mJ, (b) α = 2, τ = 20 mJ, (c) α = 3, τ = 0 mJ, (d) α = 3, τ = 20 mJ

In Table 5.1, the average energy dissipations at sensor nodes are compared for the small sensor network given in Figure 5.1. The routing topologies where only the transmitter energy is considered and the overhead energy is not taken into account will cause an obvious energy waste on sensor nodes.

In summary, the overhead energy is an intrinsic component of energy dissipation at sensor nodes.
Neglecting this important factor during routing decisions may result in worse routing alternatives while promoting meaningless multi-hop communication links and resulting in a significant amount of energy waste.

Table 5.1. Average energy dissipation at sensor nodes

α        Explanation                                            E (mJ)
α = 2    Topology in Figure 5.1 (b), where τ is considered      21.13
α = 2    Topology in Figure 5.1 (a), where τ is ignored         34.28
α = 2    Relative energy loss (%)                               62 %
α = 3    Topology in Figure 5.1 (d), where τ is considered      53.31
α = 3    Topology in Figure 5.1 (c), where τ is ignored         61.80
α = 3    Relative energy loss (%)                               16 %

5.2. Simulations on the Effect of Overhead Energy

In order to visualize the effect of neglecting the overhead energy parameter during routing calculations, we performed simulations using Opnet Modeler [49]. We have implemented two similar minimum energy tree construction algorithms based on the Distributed Bellman-Ford Algorithm [50]. In the first case, the “Ignore” algorithm (IA) considers only the transmitter energy and tries to establish connections between nodes where the transmission power is minimized, while ignoring the overhead energy dissipation at each hop. In the second case, the “Consider” algorithm (CA) considers the total energy cost as given in Equation 2.6 while constructing the routing tree.

5.2.1. Simulation Setup

The sensor nodes are assumed to be capable of adjusting their transmitter power to the minimum level that is sufficient for their radio packets to reach their destination. In simulations, however, we have used 20 discrete power levels instead of a continuous scale.

For each experiment, 10 different random sensor networks are generated. The graphs are plotted using the average values derived from these networks, with a 95 per cent confidence interval. Each sensor network consists of one sink node and 100 sensor nodes. The sink node is located in the middle of the area, whereas the sensor nodes are distributed uniformly.
We have also considered locating the sink node at one of the corners of the area, which did not change the overall behavior of the system.

The sensors are assumed to use 800 mW transmission power for a 200 m radio range in open air (α = 2). These values are chosen, as they are very close to the Berkeley/Crossbow Mica Motes’ specifications [8]. However, we have scaled the radio range for our simulation environment, where we used a constant path loss exponent value of α = 3.

The initial battery capacity of the sensors is chosen to be 200 J. In [51], it is given that for an alkaline-manganese dioxide battery, the typical volumetric energy density is 428 Watt hour per liter. In other words, a battery of size one cubic centimeter would have a capacity of about 1540 J. However, we have chosen a smaller value to shorten the simulation time. The behavior of the simulations will not change, since a smaller battery capacity only causes the results to appear earlier.

The sensors are assumed to perform independent readings, and therefore independent packet generations. The packet generation process is assumed to be a Poisson process with rate λ = 1 packet per hour, where we assume a continuous monitoring application. Alternatively, a periodic process could also be chosen, where the sensors are polled with a predefined frequency.

The energy model in Equation 2.6 is used to calculate the average energy spent at each sensor node for one packet transmission. We have considered τ = 0 mJ to 100 mJ to examine the effect of different overhead energy levels.

The network lifetime is defined as the length of time until the first battery drain-out among all sensor nodes occurs [52]

$$T = \min_{j \in N} \left\{\, t : E_j(t) = 0,\ t \in \mathbb{R} \,\right\} \qquad (12)$$

where the sensor’s energy reserve $E_j(t)$ is defined as a monotonically decreasing function of time. Other alternatives for the network lifetime are discussed in Section 6.1.2.

5.2.2. Results

In this work, we monitor the network lifetime, the average hop count and the average energy spent per packet at each node. These values are calculated as follows. After the network setup phase, a communication tree is formed. Thereafter, for each sensor, the communication path from itself to the sink node is traversed, and both the number of hops and the necessary communication energy are recorded.

IA always generated the same routing tree in spite of varying overhead energy levels, because it ignored the effect of overhead energy during the routing tree formation phase. For the effect of other simulation parameters such as node density and path loss exponent, the reader may refer to Chapter 4.

Figure 5.2. Average packet delivery energy versus overhead energy

In Figure 5.2, we have compared the average energy load of a packet on the network. For each data packet generated at any sensor node, the total energy dissipation on the path towards the sink node is calculated. The graph shows the average of this total energy over all sensor nodes in the network with respect to increasing overhead energy values. Since IA produced the same routing tree, the average energy dissipation increases linearly. However, CA was able to find more energy efficient routes, reducing the total energy dissipation for a packet to reach the destination.

Figure 5.3. Average node energy versus overhead energy

Figure 5.4. Average hop count versus overhead energy

In Figure 5.3, the energy dissipation at each sensor node is compared individually. In this graph, we can clearly see that sensor nodes spend more energy when they are connected using the routing trees found by CA.
Although individual energy dissipation is higher compared to IA, we have seen in Figure 5.2 that the total energy dissipation is lower. Whenever the overhead energy becomes a significant element in the energy cost, the routing algorithm prevents unnecessary hops and therefore the energy waste caused by the overhead energy that is spent at each hop. The result can easily be seen in Figure 5.4, where the average hop count in the routing trees is compared. The larger the overhead energy spent at each hop, the smaller the average number of hops in the network.

Figure 5.5. Network lifetime versus overhead energy

In Figure 5.5, the network lifetime is observed. It is obvious that increasing the overhead energy shortens the lifetime, since the energy dissipation at the sensor nodes becomes higher. In addition, we can clearly observe that ignoring the overhead energy parameter in routing calculations results in suboptimal routing trees. As an example, consider τ = 50 mJ. The network would be alive for only 3.6 days where the routing tree is constructed using IA. At the same overhead energy level, CA would create a more efficient routing tree where the lifetime would increase to 5.5 days, a gain of more than 50 per cent. For a larger overhead energy value of τ = 100 mJ, this gain in network lifetime is nearly 65 per cent.

5.3. Conclusions on the Effect of Overhead Energy

The overhead energy is an intrinsic component of energy dissipation at sensor nodes. In this work, we have analyzed the effect of neglecting the overhead energy dissipation in routing decisions. Neglecting this important factor during routing decisions may result in worse routing alternatives while promoting meaningless multi-hop communication links and resulting in a significant amount of energy waste.
The network lifetime would decrease significantly if the routing algorithm does not consider overhead energy dissipation.

6. MULTIPLE SINK SENSOR NETWORK DESIGN PROBLEM

One of the most important design criteria in wireless sensor networks is energy efficiency. The system designer should always consider the limited battery power of the sensor nodes in each network design decision. In this chapter, several network design objectives are presented. Each case represents another point of view on the energy saving consideration. Because of the decision priorities of these objectives, each of them requires a different approach to derive an optimized solution.

Here, we will state design issues that might be important for large scale wireless sensor networks, including several design criteria, routing alternatives and redeployment scenarios. After that, the similarities and differences with the classical concentrator location problem are presented. Finally, our problem is stated together with the solution technique.

6.1. Design Criteria

Before going into the design objectives, we have to introduce the criteria that are important in multiple sink sensor network design, such as the number of sinks, the assignment of the sensor nodes to the sinks, the best location of the sinks, and the underlying routing algorithm.

6.1.1. Number of Sinks

Since the sink nodes are expensive devices, their number should be chosen economically. To explore the problem, we analyze the two extreme points. On one hand, we can use only one sink node for the whole sensor network. When the size of the network increases, however, the average length of the paths, or similarly, the number of hops from the sensors to the sink node will increase. Therefore, the energy dissipation for each packet delivery increases as well. This will result in a decrease in the network lifetime.

On the other hand, we can associate one sink with each sensor node, and locate these sinks very close to their associated sensor nodes.
This time, the transmission energy requirement at the sensor nodes will be at the minimum, which results in the theoretical maximum sensor network lifetime. This deployment strategy is clearly not meaningful in terms of economical investment. The cost of a sink node is hundreds or even thousands of times higher than the cost of a sensor node. Therefore, the number of sink nodes is an important design criterion, which is directly dependent on the available budget reserved for the sink nodes,

$$\sum_{s \in S} c_s^{S} \le D_S \qquad (6.1)$$

where $s \in S$ is any sink node, $c_s^{S}$ is the investment cost for that sink, and $D_S$ is the budget dedicated to the total sink investment.

6.1.2. Network Lifetime

Depending on the underlying application, the network lifetime can be defined differently. In a mission critical application such as medical surgery, healthcare systems, or military applications, the failure of even a single sensor might be important. The lifetime is defined in this case as the length of time until the first battery drain-out among all sensor nodes occurs [52],

$$T = \min_{j \in N} \left\{\, t : E_j(t) = 0,\ t \in \mathbb{R} \,\right\} \qquad (6.2)$$

where the sensor’s energy reserve $E_j(t)$ is defined as a monotonically decreasing function of time (see Definition 3.15).

For general purpose monitoring applications, the reliability of the data retrieved from the environment becomes a good metric to define the network life. For example, we can define the ratio of the uncovered area to the whole area under investigation as the reliability metric. This ratio $\rho_A(t)$ is a monotonically increasing function, since the uncovered area increases with time. When a predefined threshold value is reached, we say that the network is not reliable anymore. For critical applications, we can set this threshold to a lower value, and for less critical applications to a larger value.
$$\rho_A(t) = \frac{\text{uncovered area at time } t}{\text{total area}} \le \rho_{Threshold} \qquad (6.3)$$

The estimation of the uncovered area is not straightforward, as the sensing regions of the sensor nodes overlap. Therefore, some approximations can be derived for the reliability ratio. For example, the number of unreachable nodes or similarly the number of exhausted nodes can be used for the estimation as given in Equation 6.4 and Equation 6.5.

$$\rho_U(t) = \frac{\text{number of unreachable nodes at time } t}{\text{total number of nodes}} \qquad (6.4)$$

$$\rho_E(t) = \frac{\text{number of exhausted nodes at time } t}{\text{total number of nodes}} \qquad (6.5)$$

6.1.3. Routing

The underlying routing method is another important decision point. The energy dissipation is closely related to the routing method. On one extreme, the application might not be using any routing method at all, sending the data packets directly to the sink nodes. This kind of communication will require either very powerful radio transmitters at the sensor nodes, providing a large radio range, or a large number of sink nodes in the close neighborhood of the sensor nodes.

A more common alternative is using a multi-hop ad hoc network infrastructure. In this case, sensor nodes forward the data packets of other sensor nodes towards the sink nodes. The routes are found considering their corresponding energy requirements. In order to handle the results of any routing algorithm, we have defined in Section 3.2.2 the path matrix given in Equation 6.6,

$$p_{jk}^{is} = \begin{cases} 1 & \text{if } (j, k) \text{ is used on the path } P_{i \to s} \\ 0 & \text{otherwise} \end{cases} \qquad (6.6)$$

where $i \in N$ is an arbitrary initiator node, and $s \in S$ is an arbitrary sink node.

6.1.4. Cluster Members

Whenever the designer has prior knowledge of the available investment budget for the sink nodes, the number of sink nodes that will be deployed can easily be found. At this point, the clusters should be formed, indicating the sensor-to-sink assignments. The sensors within a cluster will be communicating with the sink node that is allocated for them.
This cluster formation should be performed according to the energy dissipation of the sensors. The sink nodes may have an internal limit on the number of sensors connected to them. The data that is queried from the sensors and forwarded to the control center may exceed the communication capacity of the sink nodes. Therefore, the cluster size could have a practical limit as given in Equation 6.7

\[ |B_s| \le K_s \tag{6.7} \]

where $B_s$ is the branch set of the sink node $s \in S$ (see Definition 3.10), and $K_s$ is the service capacity of that sink. Using Lemma 3.7 we can rewrite the constraint as given in Equation 6.8

\[ \sum_{j \in N} \sum_{i \in N} \sum_{k \in V} p^{is}_{jk}\, p^{js}_{js} \le K_s \tag{6.8} \]

where $p^{is}_{jk}$ is the path matrix.

6.1.5. Location of Sinks

We know that the number of sinks must be kept as low as possible because of their investment costs. Having decided on the number of sink nodes, we form the clusters for which these sinks are going to be responsible. In the meantime, we also try to find the optimum placement for these sink nodes within each cluster. Evidently, these sink nodes should be located close to the center of the sensor nodes in each cluster.

Due to environmental restrictions, there might be regions where the sink nodes cannot be placed at all. Prior knowledge of such location limitations should also be incorporated into the optimization problem. Moreover, the cost of placement may differ between regions. Therefore, this cost should also be incorporated into the optimization problem. Assume that $c_s^P(x, y)$ is a real-valued regional placement cost, where $x, y \in \Re$ give the coordinates of the location. Then the total placement cost is given in Equation 6.9.

\[ C^P = \sum_{s \in S} c_s^P(x_s, y_s) \tag{6.9} \]

6.1.6. Data Generation Rate

The sensor nodes generate monitoring results at different rates for different applications.
For example, an agricultural or environmental monitoring application might require periodic data generation, where the up-to-date status of the field is monitored continuously. However, in a warehouse management application, the data generation could be initiated on demand by the data center. These on-demand queries could be addressed to the whole network or to just a subset of the region. In addition, the data generation rate in a continuous monitoring application could also differ throughout the region. When the designer has prior knowledge of the environment, he may choose to install some specially tuned nodes that generate measurements more often at critical places. In that case, the sink nodes might be installed closer to those critical places, where more data packets are generated.

6.1.7. Energy Model

Efficient sensing circuitry and computation algorithms help to reduce the sensing and computation energy dissipation at the sensor nodes. The major energy dissipation components are the transmitting and receiving energy, which depend on the communication architecture and the underlying techniques. Therefore, power-aware methods must be employed in order to reduce the energy consumption during communication. In our work, we use the energy model given in Section 2.7.2. This energy model is very simple in the sense that it does not provide details on attenuation, fading, and multi-path propagation. The reader may refer to [53] if more detailed empirical and semi-empirical path loss models are needed.

6.2. Routing Decisions

6.2.1. Minimum Energy Tree

Data flow from the sensor nodes to the sink nodes forms a tree, where the root of the tree is the sink node. When we consider a network with multiple sink nodes, the tree is disconnected, forming a forest. Every connection in the tree is established depending on the energy cost of that link.
Using these connections, minimum energy paths from every sensor node to its corresponding sink node are constructed. The collection of these paths forms the minimum energy tree. In this tree, we guarantee that each data packet reaches a sink node with the overall minimum energy dissipation at the sensor nodes. The objective is to find the minimum total energy dissipation in Equation 6.10.

\[ \min E_{\text{Total}} = \sum_{j \in N} \sum_{k \in V} a_{jk}\, e_{jk} \tag{6.10} \]

where $a_{jk}$ represents the elements of the adjacency matrix of the minimum energy tree (see Definition 3.5), and $e_{jk}$ is the energy cost of the link between the nodes $j \in N$, $k \in V$ in the network (see Definition 3.3).

After the deployment of the sensor network together with all the sensor nodes and the sink nodes, the minimum energy tree can be calculated according to an energy cost metric. This tree, however, may need some modifications during the lifetime of the network. The sensor nodes that are close to the sink nodes are loaded more heavily than the leaf nodes, resulting in relatively higher energy consumption. The tree may be recalculated periodically, so that only the nodes that have sufficient residual energy participate as relay nodes. This restructuring can also be triggered on demand, whenever a large area cannot be monitored due to some node failure. Moreover, we can also deploy additional nodes in the neighborhood of the failed sensor nodes, or even replace these nodes, whenever the underlying application allows.

6.2.2. Minimize the Maximum Energy Dissipation at Sensor Nodes

The energy dissipation at a sensor node depends on the activities performed by this node. In our model, these activities include generating data packets and transmitting them towards the sink node, and forwarding data packets that are generated by other sensor nodes towards the sink node (see Section 2.7.2).
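The minimum energy tree of Section 6.2.1 can be illustrated with a shortest-path search started from all sink nodes at once. This is a minimal sketch under our own assumptions: the text above does not prescribe a particular algorithm, and the function name `minimum_energy_forest`, the `link_cost` dictionary (standing in for the energy costs $e_{jk}$), and the small example network are hypothetical.

```python
import heapq

def minimum_energy_forest(nodes, sinks, link_cost):
    """Multi-source Dijkstra from the sinks (illustrative sketch).

    link_cost maps (j, k) -> assumed energy cost e_jk of sending from j to k.
    Returns (parent, cost): parent[j] is the next hop of j towards its sink
    (None for sinks), cost[j] is the total path energy from j to that sink.
    """
    sink_set = set(sinks)
    # Reverse adjacency: once k is settled, relax every sender j that can reach k.
    incoming = {v: [] for v in list(nodes) + list(sinks)}
    for (j, k), e in link_cost.items():
        incoming[k].append((j, e))

    cost = {s: 0.0 for s in sinks}
    parent = {s: None for s in sinks}
    heap = [(0.0, s) for s in sinks]
    heapq.heapify(heap)
    settled = set()
    while heap:
        c, k = heapq.heappop(heap)
        if k in settled:
            continue
        settled.add(k)
        for j, e in incoming[k]:
            if j in sink_set:
                continue  # sinks never forward packets
            if c + e < cost.get(j, float("inf")):
                cost[j] = c + e
                parent[j] = k
                heapq.heappush(heap, (c + e, j))
    return parent, cost

# Hypothetical network: four sensors, two sinks, arbitrary link costs.
links = {("a", "s1"): 4.0, ("b", "a"): 1.0, ("b", "s1"): 6.0,
         ("c", "s2"): 2.0, ("b", "c"): 5.0, ("d", "c"): 1.0, ("d", "b"): 1.0}
parent, cost = minimum_energy_forest(["a", "b", "c", "d"], ["s1", "s2"], links)
print(parent)  # node b relays through a (total 5.0) rather than sending directly (6.0)
```

Because every sink starts with zero cost, each sensor ends up attached to whichever sink is cheapest to reach, which yields the forest structure described above.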
When we try to minimize the energy dissipation at each node, the resulting connections form a minimum energy tree, as we have seen above. In order to establish a fair energy control mechanism in the network, we may instead try to spread the load over the network. A good way to formulate this objective is to minimize the maximum energy dissipation over all sensor nodes in the network. Thus, whenever a node has a large relaying load causing more energy dissipation, some of the load is transferred to a neighboring sensor node whose relaying load is smaller. The objective is given in Equation 6.11

\[ \min \max_{j \in N} \{ e_j \} \tag{6.11} \]

where $e_j$ is the relay energy load of a sensor node $j \in N$ (see Definition 3.12). Using Lemma 3.9 we can rewrite the objective as given in Equation 6.12

\[ \min \max_{j \in N} \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} p^{is}_{jk}\, e_{jk} \tag{6.12} \]

where $p^{is}_{jk}$ is the path matrix, and $e_{jk}$ is the energy cost. Whenever the packet generation rate of the sensor nodes is known, we can incorporate it into the objective to minimize the maximum energy dissipation during a given time period. Then the previous objective is modified to reflect the time parameter as given in Equation 6.13

\[ \min \max_{j \in N} \{ e_j(t) \} \tag{6.13} \]

where $e_j(t)$ is the total energy dissipation of a sensor node $j \in N$ during the time period $t$ (see Definition 3.15). Using Corollary 3.6 we can rewrite the objective as given in Equation 6.14

\[ \min \max_{j \in N} \sum_{s \in S} \sum_{i \in N} \sum_{k \in V} (1/\mu_i)\, p^{is}_{jk}\, e_{jk} \tag{6.14} \]

where $\mu_i$ is the average packet interarrival time (see Definition 3.14), $p^{is}_{jk}$ is the path matrix, and $e_{jk}$ is the energy cost. This mechanism prolongs the network lifetime by spreading the energy load over the whole network. In [52], a similar approach is presented, where the lifetime of each individual sensor node is maximized, considering the data flow in the network.
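To make the inner sum of Equation 6.14 concrete, the sketch below evaluates it for one fixed routing tree. It is an illustration only: the routing is represented by next-hop pointers instead of the full path matrix, and the names `node_energy_rates`, `parent`, `link_cost` and `mu`, as well as all numbers, are hypothetical.

```python
def node_energy_rates(parent, link_cost, mu):
    """Energy spent per unit time by each sensor for a fixed routing tree.

    parent[j]    : next hop of node j (None for a sink)
    link_cost    : (j, k) -> assumed energy cost e_jk of one packet on link (j, k)
    mu[i]        : average packet interarrival time of initiator i
    Walking the path of every initiator i adds (1 / mu_i) * e_jk to each
    node j on that path, which is the triple sum of Equation 6.14 for node j.
    """
    load = {j: 0.0 for j in mu}
    for i, mu_i in mu.items():
        j = i
        while parent[j] is not None:
            k = parent[j]
            load[j] += link_cost[(j, k)] / mu_i
            j = k
    return load

# Hypothetical tree: b relays through a to sink s1, d relays through c to sink s2.
parent = {"s1": None, "s2": None, "a": "s1", "b": "a", "c": "s2", "d": "c"}
link_cost = {("a", "s1"): 4.0, ("b", "a"): 1.0, ("c", "s2"): 2.0, ("d", "c"): 1.0}
mu = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}  # one packet per time unit each

load = node_energy_rates(parent, link_cost, mu)
worst = max(load, key=load.get)
print(load, worst)  # node a carries its own packet and b's packet over its costly link
```

A routing algorithm pursuing Equation 6.14 would compare candidate trees by the largest entry of `load` and prefer the tree in which that entry is smallest.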
A major disadvantage of this method, however, is that the batteries of all sensor nodes are going to be exhausted at a similar time. Therefore, the idea of redeploying additional sensor nodes will not be economically feasible.

6.2.3. Minimize the Maximum Energy Path

Another method to spread the energy load over the whole network could be to eliminate long paths in the routing tree. Each sensor packet then reaches a sink node through a path chosen with the total energy dissipation along that path as the objective function. In this case, we try to minimize the maximum total energy dissipation for a data packet as given in Equation 6.15

\[ \min \max_{i \in N,\, s \in S} \{ e_{i \to s} \} \tag{6.15} \]

where $e_{i \to s}$ is the total energy dissipation for a data packet from an arbitrary initiator node $i \in N$ arriving at a sink node $s \in S$ (see Definition 3.12). Using Lemma 3.8 we can rewrite the objective as given in Equation 6.16

\[ \min \max_{i \in N,\, s \in S} \sum_{j \in N} \sum_{k \in V} p^{is}_{jk}\, e_{jk} \tag{6.16} \]

where $p^{is}_{jk}$ is the path matrix, and $e_{jk}$ is the energy cost.

6.2.4. Maximum Residual Energy Path

Whenever the routing tree in the sensor network can be modified several times during the lifetime, the residual energy at the sensor nodes can also be used in routing decisions. The idea is to use as relay nodes only those sensor nodes that have more energy than the other nodes. The objective is then to maximize the minimum residual energy path in the routing tree. However, this objective contradicts the minimum energy tree objective. The nodes on the minimum energy tree are utilized more, so they cannot be used on the maximum residual energy path. Therefore, this path will be more costly in terms of energy dissipation at the relay nodes. The idea of considering the residual energy in routing decisions could be used as a constraint, where only the nodes with sufficient energy are considered as relay candidates.

6.3. Redeployment Scenarios

The network lifetime is closely related to the number of sensor nodes whose batteries are exhausted. Therefore, in order to prolong the network lifetime, the investor might choose some redeployment mechanism, whenever the underlying application does not physically prevent it. After the redeployment phase, routing paths should be recalculated. There are several alternatives for the redeployment.

6.3.1. Random Redeployment

The sensor nodes could again be installed without any selectivity, with the whole region under consideration. The number of new sensor nodes is a design parameter, for which a percentage of the initial number of sensor nodes could simply be used.

6.3.2. Neighborhood Redeployment

The new sensor nodes could be installed in the close neighborhood of the sensor nodes having an energy shortage. Here, both the “shortage” and the “neighborhood” should be defined before the redeployment phase. Moreover, the number of new sensor nodes should also be specified.

6.3.3. Replacement

In this scenario, the sensor nodes, or only the batteries of the sensor nodes, are replaced one by one. This replacement could be performed periodically, considering nodes that are “close” to being exhausted, or it could be performed on demand, whenever a node fails. The feasible replacement period and the level of battery shortage should be specified.

6.3.4. Redundant Deployment

Whenever the redeployment of additional sensors is not possible because of environmental constraints or high redeployment cost, the designer might choose to deploy redundant sensor nodes, which stay idle until a neighboring sensor dies out. In this case, the designer should plan the deployment of these additional nodes by forecasting the regions where an energy shortage will occur.

6.4. Sink Location Problems

6.4.1. Find the Best Sink Locations (BSL)

In many real-world deployment scenarios, the designer will have a predefined budget granted for the investment.
Therefore, the number of sink nodes is known prior to the deployment phase. Since we know the number of sink nodes, which is also the number of clusters in the network, the only remaining problem is the efficient clustering of the sensor nodes. We call this the “Best Sink Locations” problem (BSL).

There are many good clustering algorithms in the literature. Several techniques are presented in [54] and [55]. The most commonly used clustering algorithms are classified as hierarchical and non-hierarchical methods. The non-hierarchical methods are usually referred to as k-means clustering. For an implementation of this algorithm, the reader may refer to [56]. Another generic method is the self-organizing map, which is a general-purpose unsupervised learning algorithm [57, 58]. None of these methods provides the optimum number of clusters that should be formed. The number of clusters must be given to the algorithms as a decision parameter.

The exact locations of the sink nodes are easily found when the clustering algorithm completes. Whenever the Euclidean distance is used as the clustering metric, the center of mass of the nodes within a cluster gives the location of its sink node. Depending on the priorities of the routing algorithm, power-aware distance metrics, like Equation 2.6, could also be used.

6.4.2. Minimize the Number of Sinks for a Predefined Minimum Operation Period (MSPOP)

In some applications, the investor might request that the sensor network be operational for a predefined duration. For example, in agricultural applications, the field must be monitored until the harvest. Therefore, the sensor network should be reliable until the crops have grown and been reaped from the field. The farmer is definitely going to deploy a new sensor network in the next season. We call this problem the “Minimization of the number of Sink nodes for a Predefined minimum Operation Period” (MSPOP).
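For the BSL problem of Section 6.4.1, the cluster-then-centroid placement can be sketched with a plain Lloyd-style k-means iteration on the sensor coordinates. This is only an illustration, assuming the Euclidean metric; it is not the implementation referenced in [56], and the function name `kmeans_sink_locations`, its parameters, and the sample coordinates are hypothetical.

```python
import random

def kmeans_sink_locations(points, k, iterations=50, seed=0):
    """Place k sinks at the centers of mass of k clusters (Euclidean k-means).

    points is a list of (x, y) sensor coordinates. Returns (centers, clusters);
    if the iteration limit is hit before convergence, clusters reflects the
    assignment made with the previous centers.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct sensor positions
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each sensor joins its nearest center.
        clusters = [[] for _ in range(k)]
        for (x, y) in points:
            nearest = min(
                range(k),
                key=lambda c: (x - centers[c][0]) ** 2 + (y - centers[c][1]) ** 2,
            )
            clusters[nearest].append((x, y))
        # Update step: move each center to the center of mass of its members.
        new_centers = []
        for c, members in enumerate(clusters):
            if members:
                new_centers.append((
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                ))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Hypothetical field with two well separated groups of sensors.
sensors = [(0, 0), (0, 2), (2, 0), (2, 2),
           (100, 100), (100, 102), (102, 100), (102, 102)]
centers, clusters = kmeans_sink_locations(sensors, k=2)
print(sorted(centers))
```

k-means only finds a local optimum and needs the number of clusters as input, which matches the remark above that the number of sinks must be supplied as a decision parameter.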
In order to solve the MSPOP problem, we have to calculate the sensor network lifetime for any number of sink nodes. Then, the solution with the minimum number of sink nodes whose network lifetime exceeds the predefined constraint is selected.

The major issue in this problem is to select the correct number of sink nodes. The brute force technique is to start with only one sink node, as stated in [54]. The network lifetime is evaluated while the number of sink nodes is incremented by one at each step. The search stops as soon as the desired lifetime is reached. This incremental search might take too long when the actual number of required sink nodes is large. In this case, the search might start at a point near which the designer, from prior knowledge, expects the result to lie. Similarly, a binary search technique could also be used. Starting from one, the number of sink nodes is doubled until a feasible solution is reached. Thereafter, the solution space is narrowed down by halving the interval, until the minimum number is found.

6.4.3. Find the Minimum Number of Sinks while Maximizing the Network Life (MSMNL)

We may also try to extend the network lifetime as much as possible with the most economical investment. We call this problem the “Minimization of the number of Sink nodes while Maximizing the Network Lifetime” (MSMNL).

When we have prior knowledge of neither the number of sink nodes nor the lifetime constraint, we face a combinatorial optimization problem. In this case, the objective should combine the two goals within one function, where the budget reserved for the sink nodes is somehow connected with the lifetime of the sensor nodes. The initial investment for the sensor nodes should be utilized for the longest possible period. Here, the “cost per unit time” metric could be used. In order to reach the required timeframe, we may need to perform some redeployment within the network.
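Returning to the search over the number of sink nodes in Section 6.4.2, the incremental and the doubling-then-halving strategies can be sketched as follows. The sketch assumes that the lifetime never decreases when a sink is added, which the halving step relies on; `lifetime_for` is a hypothetical callable standing in for the lifetime estimation, and the linear lifetime model in the example is invented purely for illustration.

```python
def min_sinks_incremental(lifetime_for, required, k_max):
    """Try k = 1, 2, ... and stop at the first k meeting the requirement."""
    for k in range(1, k_max + 1):
        if lifetime_for(k) >= required:
            return k
    return None  # requirement cannot be met with at most k_max sinks

def min_sinks_doubling(lifetime_for, required, k_max):
    """Double k until feasible, then bisect between the last infeasible
    and the first feasible value. Assumes lifetime_for is non-decreasing."""
    lo, hi = 1, 1
    while lifetime_for(hi) < required:
        if hi >= k_max:
            return None
        lo = hi + 1               # every k up to hi is known to be infeasible
        hi = min(2 * hi, k_max)
    while lo < hi:                # invariant: hi is feasible, answer is in [lo, hi]
        mid = (lo + hi) // 2
        if lifetime_for(mid) >= required:
            hi = mid
        else:
            lo = mid + 1
    return hi

# Invented lifetime model: 8 days of lifetime per sink node.
def toy_lifetime(k):
    return 8 * k

print(min_sinks_incremental(toy_lifetime, 30, k_max=16))
print(min_sinks_doubling(toy_lifetime, 30, k_max=16))
```

Each call to `lifetime_for` stands for a full placement and lifetime estimation, so the doubling variant pays off mainly when the required number of sinks is large.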
We should add the cost of the redeployment to the initial investment cost, which includes both the sensor and the sink nodes. A near-optimal solution for this problem could be found using heuristic techniques such as simulated annealing or genetic algorithms.

6.5. Differences with Concentrator Location Problem

The problem of finding the number and the location of the sink nodes resembles some classical problems like the plant location problem [59], the warehouse location problem [60], and the concentrator location problem (CLP), which has received great interest in the literature [61-67]. There are, however, several differences between these problems and the multiple sink location problem.

In these problems, although the locations of the central nodes are unknown, a list of possible locations is given. Placing a central node at one of these locations has a varying cost, which is combined with the cost of the central node itself and discourages the designer from choosing many of them. In our problem, however, the sinks can be placed anywhere in the environment, without any prior knowledge. Therefore, enumeration of the location alternatives is not possible.

The communication between the sensor nodes and the sink nodes is performed using multi-hop links, which go through other sensor nodes in the network, whereas in CLP the connections are direct links. In addition, all these intermediate sensor nodes should also be connected to the same sink node.

The capacity of the central nodes is a constraint in CLP. Direct connections to the central nodes cannot exceed their capacity. In our problem, the sink nodes also have a predefined capacity, as the sensor nodes communicate with the sink nodes through wireless links. However, the capacity is related only to the data processing power of the sink nodes, rather than to the direct communication links between the sensor nodes and the sink node.
Using multi-hop connections, the number of sensor nodes that are actually connected to any sink node is larger than the number of direct connections.

6.6. A Solution Technique for the MSPOP Problem

After discussing the issues related to the multiple sink sensor network design and listing alternative problems, in this section we consider the MSPOP problem in particular and propose a solution technique.

1. Deploy the sensor nodes
2. Wait until the sensor nodes find their location information
3. Collect the location information from the field
4. k = 0
5. Repeat
   i. k = k + 1
   ii. Find the best location for k sink nodes
   iii. Estimate the network lifetime
6. Until required network lifetime is reached
7. Output the sink locations, and the estimated lifetime

Figure 6.1. System design algorithm

This problem is by nature an off-line problem, which should be solved by the system designer at a central location. Therefore, the location information of the sensor nodes should be collected from the field before the solution phase. The algorithm of the system design phase is given in Figure 6.1. In the following sections, each step of this algorithm is explained in detail.

6.6.1. Deployment of the Sensor Nodes

Depending on the underlying application, there may be several alternatives for sensor network deployment. In an agricultural application, the sensor nodes might be scattered by the farmer in a nearly uniform manner. In an environmental application like forest fire detection, they might be dropped from an aircraft. In in-house applications, they might be installed by hand by the construction workers. Nevertheless, the clustering algorithm that is used in Step 5.ii in Figure 6.1 should be able to deal with these deployment scenarios. In addition, we might also use several different clustering algorithms, each one optimized for a different deployment type.

6.6.2. Finding Location Information

In order to calculate the sink locations, we must know the location of each individual sensor node. Location information can easily be derived using central or distributed methods (see Section 2.3). If we use a central approach, we can continue our algorithm with Step 3 in Figure 6.1, where the central agent can perform an off-line location estimation. The energy dissipation at the sensor nodes for the location finding process is not considered in this problem.

6.6.3. Collecting the Location Information from the Field

After the deployment of the sensor nodes, we can use a mobile terminal that traverses the field to collect the location data. If we do not have such a special-purpose terminal, then we have to install a sink node temporarily. We can take this sink node back during the final deployment phase. When this is not possible due to physical limitations of the application, we can choose to use this node as a fixed sink node in the final deployment, and consider it in the clustering algorithm without moving it.

6.6.4. Finding the Best Location for K Sink Nodes

If the number of sink nodes is known, then we have the problem defined in Section 6.4.1. The clustering algorithm is responsible for locating the sink nodes. Depending on the deployment distribution, we can choose different clustering algorithms at this step. In our implementation, we have used the well-known k-means clustering algorithm [55, 56].

6.6.5. Estimating the Network Lifetime

In order to fulfill the time constraint, we have to estimate the network lifetime with the sink locations proposed at Step 5.ii in Figure 6.1. The details on network lifetime are given in Section 6.1.2, where we have defined several reliability metrics. In our implementation, we have used the ratio in Equation 6.4. We use a simulator to find out when the predefined reliability threshold is exceeded, which gives the estimate of the lifetime.
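As a small illustration of this step, the lifetime can be read off a series of simulator samples as the first sampling time at which the ratio of Equation 6.4 exceeds the threshold. The function name `estimate_lifetime`, the sample format, and the sample values below are hypothetical; they are not taken from the simulations reported in this chapter.

```python
def estimate_lifetime(samples, total_nodes, threshold=0.25):
    """Return the first sampled time at which the unreachable-node ratio
    (Equation 6.4) exceeds the threshold, or None if it never does.

    samples is an iterable of (time, unreachable_count) pairs in
    increasing time order, e.g. one pair per simulated day.
    """
    for t, unreachable in samples:
        if unreachable / total_nodes > threshold:
            return t
    return None

# Invented daily samples for a 200-node network: (day, unreachable nodes).
samples = [(10, 12), (20, 40), (30, 55), (40, 90)]
print(estimate_lifetime(samples, total_nodes=200))  # ratios: 0.06, 0.20, 0.275, 0.45
```

The resolution of the estimate is limited by the sampling interval, so a finer sampling near the threshold crossing gives a sharper lifetime value.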
This simulator should consider the packet generation rate, which is an application criterion; the sensor node's hardware specifications, like the battery capacity, communication data rate, overhead energy, and transmitter power requirements; and environmental characteristics, like the path loss exponent.

6.7. Computational Experiments on Multiple Sink Sensor Network Problems

In this section, we give two examples of multiple sink sensor network design problems. First, we provide a demonstrative example for the BSL problem on a sample sensor network with three sink nodes. After that, we consider the MSPOP problem and show the application of the solution technique described in Section 6.6 to several sensor network deployments.

6.7.1. Simulation Setup

In our simulations, we only use sensor nodes having power-adjustable transmitter circuitry, Pcont nodes. These nodes have a power level adjustment capability on a continuous scale. In the simulations, however, we have used 20 discrete power levels instead, to approximate continuity. In Chapter 4, we have shown that using this type of circuitry, we can easily double the lifetime of each individual sensor node. Here, we focus on the number of sink nodes, and try to derive a relation between the number of sink nodes and the network lifetime.

The simulations are performed on the test bed developed in Chapter 4, using Opnet Modeler [49]. The sample sensor network consists of 200 nodes, which are distributed uniformly over a planar square region of 200 m x 200 m. Although the number of nodes and the area size are rather small, the techniques can easily be applied to networks that are larger both in area and in number of nodes.

The sensors are assumed to use 800 mW transmission power for a 200 m radio range in open air ($\alpha = 2$). These values are chosen, as they are very close to the Berkeley/Crossbow Mica Motes’ specifications [8].
However, we have scaled the radio range for our simulation environment, where we used a constant path loss exponent of $\alpha = 3$. For the effect of different path loss exponent values on energy dissipation, the reader may refer to Chapter 4. The overhead energy $\tau$ is chosen to be 20 mJ per packet, where 400 mW receiver power is assumed. Both sensing and computation energy are neglected, since they do not affect our design decisions. The data rate of the communication channel is chosen to be 20 kbps, and a fixed packet size of 1024 bits is used. These simulation assumptions are summarized in Table 6.1.

Table 6.1. Simulation parameters

Parameter                        Value
Sample transmission power        800 mW
Sample transmission range        200 m
Data rate                        20 kbps
Packet size                      1024 bits
Minimum transmission power       100 mW
Maximum transmission power       2,000 mW
Initial battery capacity         200 J
Simulation time                  60 days
Area size (A)                    200 m x 200 m
Path loss exponent (α)           3
Number of sensor nodes           200
Reliability threshold (ρ)        0.25

The lifetime is defined in terms of the number of unreachable sensors in the network. We consider the readings of the network to be unreliable whenever the ratio of unreachable sensors exceeds the threshold value $\rho = 0.25$.

The initial battery capacity of the sensors is chosen to be 200 J. In [51], the typical volumetric energy density of an alkaline-manganese dioxide battery is given as 428 watt hours per liter. In other words, a battery with a volume of one cubic centimeter would have a capacity of 1540 J. However, we have chosen the smaller value to shorten the simulation time. The behavior of the simulations will not change, since a smaller battery capacity only causes the results to appear earlier. As a consequence, we have also chosen a short simulation time of 60 days.

The sensors are assumed to perform independent readings, and therefore independent packet generations.
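Two of the values above can be checked by hand. The battery figure follows directly from the stated energy density. For the overhead energy, the calculation below assumes that $\tau$ corresponds to receiving one packet at the stated receiver power; this is our reading of the parameters rather than something the text states explicitly, and $t_{pkt}$ is a symbol introduced here for the packet airtime.

```latex
% Airtime of one 1024-bit packet at 20 kbps
t_{pkt} = \frac{1024\ \text{bits}}{20\,000\ \text{bits/s}} = 51.2\ \text{ms}

% Receiver energy per packet (assumed interpretation of the overhead energy)
\tau \approx 400\ \text{mW} \times 51.2\ \text{ms} = 20.48\ \text{mJ} \approx 20\ \text{mJ}

% Energy stored in one cubic centimeter (10^{-3} liter) at 428 Wh per liter
E = 428\ \frac{\text{Wh}}{\text{l}} \times 10^{-3}\ \text{l} \times 3600\ \frac{\text{J}}{\text{Wh}} \approx 1540\ \text{J}
```

Under this reading, the 200 J battery used in the simulations is roughly an eighth of the one cubic centimeter reference battery.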
The packet generation process is assumed to be a Poisson process with rate $\lambda = 1$ packet per hour, where we assume a continuous monitoring application. Nevertheless, a periodic process could also be chosen here, where the sensors are polled with a predefined frequency.

6.7.2. Demonstrative Example for the BSL Problem

In this problem, we are given a sensor network, and we try to locate three sink nodes so as to reach the maximum operation time. The network is shown in Figure 6.2. In this figure, a solution for three sinks is also presented, where the locations of the sink nodes are marked with small triangles. The locations of the sink nodes are found using the well-known k-means clustering algorithm [55, 56].

After the deployment phase, we estimated the network lifetime. For this, we constructed the routing tree using the minimum energy tree approach, with the energy metric given in Equation 2.6. Then, we monitored the energy map and the disconnected region map of the network. The energy map indicates the regions where an energy shortage occurs, using iso-energy contours. The darker the contours are, the less energy is left in that region. After a period, some nodes start to become exhausted, and therefore all the nodes in the branch set of such a node are disconnected (see Definition 3.10 for the definition of branch set). The disconnected region map shows the nodes that are disconnected due to energy failures.

Figure 6.2. Sample sensor network with 200 sensors and three sinks

In Figure 6.3 and Figure 6.4, we show a series of energy and disconnected region maps, where we have taken snapshots of the sensor network once every 10 days. The maps on the left side are the energy maps, and the ones on the right side are the corresponding disconnected region maps. In the energy maps, the darkness of a region represents the energy shortage at the nodes in that region. In the disconnected region maps, shaded regions represent unreachable regions in the network.
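The way a single exhausted relay disconnects its whole branch set can be sketched with next-hop pointers. The function name `unreachable_nodes`, the `parent` dictionary, and the small tree below are hypothetical and serve only to illustrate how the disconnected region follows from the routing tree.

```python
def unreachable_nodes(parent, exhausted):
    """Return the sensor nodes that can no longer deliver packets to a sink.

    parent[j] is the next hop of node j on the routing tree (None for sinks).
    A sensor is unreachable if it is exhausted itself or if any relay on its
    path to the sink is exhausted, i.e. it lies in an exhausted node's branch set.
    """
    lost = set()
    for node, next_hop in parent.items():
        if next_hop is None:
            continue  # sinks are not sensors
        hop = node
        while hop is not None:
            if hop in exhausted:
                lost.add(node)
                break
            hop = parent[hop]
    return lost

# Invented tree: relay r serves a and b, a serves c, and d reaches the sink directly.
parent = {"s": None, "r": "s", "a": "r", "b": "r", "c": "a", "d": "s"}
lost = unreachable_nodes(parent, exhausted={"r"})
print(sorted(lost), len(lost - {"r"}))  # branch set of r, and how many functional nodes it cuts off
```

Dividing `len(lost)` by the total number of sensors gives the ratio of Equation 6.4 for the current routing tree.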
We observe in these maps that the disconnected region grows as time goes by. This behavior is expected, since the energy reserves of the sensors decrease during operation. Energy shortage occurs at nodes that serve a large branch as a relay node. None of the leaf nodes encounters an energy problem; however, some of them cannot reach the sink nodes, as they become disconnected because of failures in the relay nodes. We observe early failures at sensors that are close to the sink nodes, since they serve a larger branch set.

Figure 6.3. Energy and disconnected region maps, until the 60th day: (a), (c), (e) energy maps after days 10, 20, and 30; (b), (d), (f) disconnected region maps after days 10, 20, and 30

Figure 6.4. Energy and disconnected region maps, until the 60th day (continued): (g), (i), (k) energy maps after days 40, 50, and 60; (h), (j), (l) disconnected region maps after days 40, 50, and 60

Figure 6.5. Exhausted nodes versus time

Figure 6.6. Unreachable nodes versus time

In Figure 6.5, we show the increase in the number of exhausted nodes. We observe the first failure on the 15th day. The failed node has 12 nodes in its branch set. Therefore, the first failure causes 11 other functional nodes to become disconnected. The number of unreachable nodes can be seen in Figure 6.6. Using this figure, we can state that the network produces reliable readings for only 24 days, after which we pass the threshold for the number of disconnected nodes, $\rho = 0.25$.

Figure 6.7. Unreachable nodes versus time using rerouting

Figure 6.8. Exhausted nodes versus time using rerouting

In Figure 6.4, we observe that not every sensor node that is close to a sink node fails, since such a node may have only a very small number of sensor nodes to serve.
Therefore, we can expect that reconstructing the minimum energy tree after energy failures could prolong the network lifetime. In order to verify this expectation, we have modified the monitoring process. Whenever the reliability threshold is reached, the sink nodes initiate a new route finding process, so that exhausted nodes are removed from the routing tree, and all other disconnected nodes have another chance to be connected to the sink nodes through some other intermediate nodes.

The results are shown in Figure 6.7 and Figure 6.8. The number of unreachable nodes increases up to the reliability threshold, $\rho = 0.25$. At that point, a route recalculation process is initiated and the network survives until the next route recalculation phase. These recalculations take place with increasing frequency, since the remaining battery power of the intermediate sensor nodes decreases with time. Finally, route recalculations cannot build a reliable network anymore, and the network dies. The first recalculation takes place on the 24th day, whereas using successive recalculations, we can utilize the network until the 55th day, which more than doubles the lifetime.

6.7.3. Application of the Solution Technique to the MSPOP Problem

In this section, we perform simulations on multiple, uniformly deployed sensor networks, and try to show the application of the solution technique for the MSPOP problem. We start with one sink node, and increase the number of sink nodes by one at every step, where the locations of the sink nodes are found using the k-means clustering algorithm [55, 56]. For each deployment alternative, we estimated the network lifetime.

In Figure 6.9, we show the percentage of exhausted nodes, where the number of sink nodes varies from one to six. The cluster size decreases with an increasing number of sink nodes. Therefore, the paths from each sensor node to the sink nodes are shorter in the resulting minimum energy tree.
As a result, the energy dissipation due to packet relays decreases, and therefore the percentage of exhausted nodes decreases.

In Figure 6.10, the percentage of unreachable nodes is shown. When the number of sink nodes is small, the percentage increases very rapidly within the first few days. Then, only the nodes that are close to the sink nodes survive. Moreover, since these nodes have a relatively small relaying load, their batteries are exhausted rather slowly.

Figure 6.9. Percentage of exhausted nodes versus time, with different number of sinks

Figure 6.10. Percentage of unreachable nodes versus time, with different number of sinks

Table 6.2. Expected network lifetime, with ρ = 0.25

Number of sink nodes    Lifetime (days)
1                       7
2                       12
3                       24
4                       40
5                       51
6                       > 60

To solve our problem with $\rho = 0.25$, we have to be given a minimum desired network lifetime. If this period is required to be one month, then it is clear that we have to utilize four sink nodes. When this period is required to be at least two months, the number of required sink nodes increases to six. Table 6.2 summarizes the expected network lifetime for each number of sink nodes.

Figure 6.11. Comparison of random placement with k-means algorithm, with three sinks

With a further test, shown in Figure 6.11, we have checked the quality of the results of the k-means clustering algorithm. For the network with three sink nodes, we have located these nodes randomly, and compared the percentage of unreachable nodes with respect to time. We have seen a substantial decline in the results. The network lifetime, with $\rho = 0.25$, decreases to 13 days, compared to 24 days for k-means clustering. Therefore, we can conclude that a smart sink location and clustering algorithm significantly improves the network lifetime. Although the k-means algorithm uses the simple Euclidean distance metric to form the clusters, it generates high-quality results compared to random sink location.
If energy-aware cost metrics are introduced into the clustering algorithms, we should expect even better results.

Figure 6.12. Change in the number of sinks for different network lifetime requirements

In Figure 6.12, we analyze the change in the necessary number of sink nodes as we increase the number of deployed sensor nodes and the area of the monitored region, while keeping the sensor density constant. We have computed the necessary number of sink nodes for three different network lifetime requirements of 10, 20 and 30 days. As in the previous experiments, higher network lifetime requirements force more sink nodes to be deployed. Moreover, we see that when we have a larger area, we have to deploy more sink nodes in order to keep the network alive. The most interesting result is that, independent of the required network lifetime, the relation between the number of sensor nodes and the number of necessary sink nodes is found to be linear, where the slope of the line depends on the underlying application’s design criteria, such as the network lifetime requirement, the data generation rate and model, and the network reliability threshold ρ.

6.7.4. Conclusion for the Computational Experiments

In order to maximize the lifetime of a sensor network, the energy resources of each individual sensor node must be consumed effectively. In large-scale sensor networks, the network must be divided into smaller sub-networks, not only to increase the manageability of the network, but also to increase the network lifetime. We have introduced the multiple sink network design problem, where the best places for the sink nodes should be calculated depending on several different design criteria. We have listed these design issues together with their formulations. We have discussed several routing alternatives, redeployment scenarios, and sink location problems.
We have demonstrated a sample sink location case, where the number of sink nodes was known before the deployment phase. We have implemented the solution for this BSL problem, and presented the corresponding energy and disconnected region maps on a sample sensor network for different snapshots in time. We have observed how the disconnected region grows with time. We have encountered failures at the sensor nodes that are close to the sink nodes, because these nodes serve a larger branch set. However, not every sensor node that is close to a sink node fails early, since such a node might coincidentally have only a small number of sensor nodes to serve. We have seen that reconstructing the minimum energy tree after energy failures prolongs the network lifetime significantly. We have also proposed a solution technique for the MSPOP problem, and simulated its implementation on random sensor networks. We have analyzed the effect of adding new sinks on the network lifetime. We have presented methods to deploy an economically feasible number of sink nodes while prolonging the network lifetime. We have shown that the k-means clustering algorithm provides good sink locations. In addition, we have seen that the relation between the number of sensor nodes and the number of sink nodes is linear.

7. CONCLUSION AND FUTURE WORK

7.1. Conclusion

One of the most important design criteria in wireless sensor networks is the energy constraint of the sensor nodes. In order to maximize the network lifetime, the energy resources of each individual sensor node must be managed effectively. In this thesis, we develop techniques for controlling the total network lifetime. First, we have derived new formulations for the multiple sink sensor network design problem. Important characteristics of the sensor network infrastructure have been analyzed and the necessary definitions have been introduced.
These definitions and formulations can be used in any sensor network research, since they provide a framework that is independent of the underlying routing algorithm.

We have shown that using multi-hop paths that consist of shorter links instead of one long link might result in considerable energy gain. When the sensor nodes are equipped with transmitter circuitry that can adjust its output level, the sensor lifetime can easily be doubled on average. We have derived techniques to identify where multi-hop links are more advantageous. We have seen that the degree of hopping should decrease whenever higher overhead energy values are under consideration. We have studied different multi-hop communication scenarios and calculated the energy saving in each scenario. We have also expanded these scenarios to general cases. The generalization can be applied to any arbitrary triangle and can be used in energy optimized route calculations.

We have analyzed the effect of overhead energy, which is an intrinsic component of energy dissipation at the sensor nodes, on routing decisions. We have seen that neglecting this important factor during routing decisions may result in worse routing alternatives, promoting meaningless multi-hop communication links and causing a significant amount of energy waste. The network lifetime would decrease significantly if the routing algorithm did not consider overhead energy dissipation.

Finally, we have considered locating multiple sink nodes in the sensor environment and dividing the network into smaller sub-networks. We have introduced related problems, where the best places for the sink nodes should be calculated depending on several different design criteria. We have listed these design issues together with their formulations. We have discussed several routing alternatives, redeployment scenarios, and sink location problems.
Using a demonstrative example, we have shown the solution for the BSL problem, and presented the corresponding energy and disconnected region maps on a sample sensor network for different snapshots in time. We have observed how the disconnected region grows with time. We have also proposed a solution technique for the MSPOP problem, and simulated its implementation on random sensor networks. We have analyzed the effect of adding new sinks on the network lifetime. We have presented methods to deploy an economically feasible number of sink nodes while prolonging the network lifetime. We have seen that the relation between the number of sensor nodes and the number of sink nodes is linear.

7.2. Future Work

There are several points that might be considered as future research directions. The sensors might have a maximum load limit, which would bound the number of sensor nodes within their branch set. Using this constraint, premature battery exhaustion might be prevented.

The investor may opt for redeployments during the operation of the sensor network. After a redeployment, the routing tree should be recalculated. The effect of these redeployments should be analyzed in terms of the network lifetime.

The efficiency of the clustering algorithms under different sensor deployments should also be analyzed. When the sensors are deployed non-uniformly, special clustering algorithms might produce better results.

APPENDIX A: OPNET IMPLEMENTATION DETAILS

Opnet Modeler is a discrete-event network simulator that has been commercially available since 1987 [49]. With its process and node modeling capabilities and its interface to an open C/C++ code library, real-world network scenarios can easily be implemented. In the following sections, wireless sensor network implementation details are presented.

A.1. Wireless Sensor Network

In Figure A.1, a sample scenario is shown.
In this simple network, three different node types are used: sensor nodes, sink nodes and a configuration node. Sensor nodes are labeled from “1” to “16”, sink nodes are labeled “91” and “92”, and the configuration node is labeled “conf”.

Figure A.1. Sample wireless sensor network scenario

In this scenario, the nodes are scattered randomly over a region of size 1000 m x 1000 m. Although the node placement here is performed manually, the configuration step can scatter the nodes over the area according to a given probability distribution.

A.2. Node Model

The sensor and the sink nodes have an identical structure, each with the following node model (see Figure A.2). The node model contains building blocks, which represent individual processes, and connection links, which determine the data flow between the building blocks. The data flow occurs mainly through packet transfer. Each process is represented by a finite state machine (FSM), where transitions between states are triggered mainly by interrupts.

Figure A.2. Sensor node model

The model is based on the ISO OSI 7 layer architecture [50], and consists of an antenna interface, a radio transmitter and receiver, a data link layer, a network layer, and an application layer. The application layer summarizes all the tasks of the upper four layers, as their functionality is not a part of this thesis.

Sensor devices are given wireless communication capability through the antenna interface. This interface can be modified to represent any directional antenna pattern. However, an isotropic antenna pattern is used in our simulations. An isotropic pattern radiates (or captures) power equally in all directions. For more detail, the reader can refer to [49].

The antenna interface is connected to the radio transmitter and radio receiver interfaces. These interfaces provide the nodes with wireless communication links.
For each communication, Opnet creates a link from the source node to the destination nodes. All the nodes that are in the range of the source node can receive the data packets simultaneously. Different simulation parameters can be assigned to the radio interface. Some represent the wireless medium, such as background noise, interference noise, signal-to-noise ratio and bit error rate; others represent the radio characteristics, such as error control codes, power model, modulation type, communication frequency and data rate.

The data link layer controls the radio interface, and passes data packets from the network layer to the radio interface. In addition, data packets received from the radio receiver are forwarded to the network layer.

The network layer controls all the routing operations. Data packets that are received from other sensor nodes are forwarded towards a sink node. If a multi-hop connection is established between the sensor node and the sink node, then the packet is addressed to the next-hop sensor node.

The application layer consists of two parts, the source and the sink. The source part mimics the behavior of the sensor. The sensor is assumed to extract information from the environment. This information is then encapsulated within a data packet and forwarded to the network layer, which is responsible for the successful delivery of this packet to a sink node. Sensors can create packets according to any probability distribution, which is also a simulation parameter. The sink part may be configured to receive any application specific setup parameters, such as sensing behavior or battery control strategy. It can also generate a response to queries initiated by the control center. In this implementation, the sink part only collects the necessary simulation statistics.

A.3. Network Layer Process Model

The network layer is the interface between the application layer and the data link layer.
This layer has the following major tasks:
(i) Forwarding data packets generated by the sensor unit to the network.
(ii) Accepting setup packets from the network, and delivering them to the internal processor (sink).
(iii) Controlling all routing operations, forwarding data packets that are received from other sensor nodes towards a sink node.

Figure A.3. Process diagram for the network layer

The FSM for this operation is shown in Figure A.3. The FSM starts at the big arrow, which points to the wait state. This state introduces a delay until the other processes in the simulation are initialized. The initialization step then follows in the init state, where the control variables are loaded. Then the process enters the idle state and waits there until a specific interrupt arrives. The packet interrupts are SRC_ARRVL and DLL_ARRVL, which represent packet arrivals from the source and the data link layer, respectively. A third interrupt is RT_DISC, which represents a route discovery request, mainly generated by a sink node in the network. Sensor nodes, however, can also initiate the route discovery process in order to perform incremental route updates. Depending on the interrupt type, the FSM enters one of three states:
(i) xmt state: When a packet from the source process arrives at the network layer, the SRC_ARRVL interrupt is triggered. The FSM then enters the xmt state, where the packet is forwarded to the data link layer. During this transaction, the packet header is filled with the necessary information, such as the next-hop id needed to reach the sink node.
(ii) route state: When a packet from the data link layer arrives, the process receives the DLL_ARRVL interrupt and enters the route state. Here, mainly another sensor node’s packet is forwarded towards the sink node via the next-hop sensor node. When the packet is found to be destined for the node itself, the packet is sent to the sink process.
Here, the necessary setup parameters may be extracted or any query orders may be obeyed.
(iii) discover state: Route discovery requests arrive through the RT_DISC interrupt. The discovery packets are then forwarded to neighboring sensor nodes depending on the routing algorithm. The FSM can also enter this state when a route discovery request arrives at the sensor node from a different node in the network.

Any other interrupt that arrives while the FSM is in the idle state is ignored, and the FSM stays in that state until an expected interrupt comes. Having processed one of these interrupts, the FSM returns to the idle state, where it waits for a further interrupt.

A.4. Data Link Layer Process Model

The data link layer connects the network layer with the physical layer, where the radio transmitter and receiver are located. The corresponding FSM is shown in Figure A.4. The actual functionality of the data link layer together with the MAC layer is not implemented here, because the results examined in the simulations are independent of this functionality. Anyone who needs detailed data link layer or MAC layer functionality, however, can easily replace this layer with a detailed implementation.

Figure A.4. Process diagram for the data link layer

After the two delay states wait_0 and wait_1, where the FSM waits until the other processes are initialized, the config state follows. Here, again, the initial configuration parameters that are necessary for the simulation are loaded. Then the FSM enters the idle state, where it waits until a specific interrupt arrives. Here, only new packet arrivals create an interrupt for this process. Therefore, any other interrupt is ignored, and the FSM stays in the idle state when an unexpected interrupt arrives. The packet interrupts are NWL_ARRVL and RX_ARRVL, which represent packet arrivals from the network layer and the radio receiver, respectively.
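Both process models share the same idle-state pattern: an expected interrupt moves the FSM into a handling state, after which it returns to idle, while any other interrupt is ignored. The following is a minimal sketch of that dispatch logic only. It is not Opnet code and does not use the Opnet API; the class name and the `visited` list are hypothetical, and the interrupt and state names are taken from the descriptions in Sections A.3 and A.4.

```python
# Interrupt -> handling-state tables, as named in Sections A.3 and A.4.
NETWORK_LAYER = {"SRC_ARRVL": "xmt", "DLL_ARRVL": "route", "RT_DISC": "discover"}
DATA_LINK_LAYER = {"NWL_ARRVL": "xmt", "RX_ARRVL": "up"}


class IdleDispatchFsm:
    """Hypothetical model of an FSM that waits in idle and dispatches on interrupts."""

    def __init__(self, transitions):
        self.transitions = transitions
        self.state = "idle"
        self.visited = []  # handling states entered so far, kept for inspection

    def handle(self, interrupt):
        """Process one interrupt and return the state the FSM ends up in."""
        target = self.transitions.get(interrupt)
        if target is None:
            return self.state  # unexpected interrupt: ignored, FSM stays idle
        self.state = target  # enter the handling state (packet work would go here)
        self.visited.append(target)
        self.state = "idle"  # handling done: return to idle
        return self.state


if __name__ == "__main__":
    fsm = IdleDispatchFsm(NETWORK_LAYER)
    for interrupt in ("SRC_ARRVL", "TIMER", "RT_DISC"):
        fsm.handle(interrupt)
    print(fsm.visited)
```

The startup states (wait, init, config) and the actual packet processing are left out. The sketch covers only the mapping from interrupts to states and the return to idle.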
Depending on the interrupt type, the FSM enters one of two states:
(i) xmt state: When a packet from the network layer arrives at the data link layer, the NWL_ARRVL interrupt is triggered. The FSM then enters the xmt state, where the packet is forwarded to the radio transmitter.
(ii) up state: When a packet from the radio receiver arrives, the process receives the RX_ARRVL interrupt and enters the up state. The packet is then forwarded to the network layer, where a decision on the next-hop destination of this packet is made.

A.5. Packet Structure

The sensor nodes communicate using predefined, fixed length data packets. The size of these packets is 1024 bits. The packet has the structure shown in Table A.1.

Table A.1. Packet structure

Field    Explanation                      Length (bits)    Data type
nxt      Next-hop id                      32               integer
src      Current source id                32               integer
posx     Current source x-coordinate      64               double
posy     Current source y-coordinate      64               double
power    Current transmission power       64               double
e_tot    Total energy from a sink node    64               double
typ      Packet type                      8                char
data     Payload                          696              string

The header information is kept as flexible as possible, in order to make simulation programming easier. In a real sensor network implementation, however, this part should be kept more concise.

• nxt field: This field holds the network identifier of the next-hop node. The packet is actually destined for this node, and all other nodes overhearing this packet should ignore and discard it. The value of this field is updated at the network layer, where the routing process is implemented. The node address “0” is used as the broadcast address, and packets with this next-hop id are accepted by all nodes.
• src field: The packet is sent from the node with the network identifier shown in this field. Each time the packet is relayed by an intermediate node, the value is updated.
• posx field: This field holds the x-coordinate of the node.
The value is used for geographical distance measurements. Again, each time the packet is relayed by an intermediate node, the value is updated to the x-coordinate of the relay node.
• posy field: This field holds the y-coordinate of the node and its usage is similar to the posx field.
• power field: The transmission power used at the radio transmitter is written into this field. If the power model is actually implemented in the radio interface, this field can easily be dropped. The value of this field is used to determine whether the received packet could actually be heard by the receiver node. Whenever the transmitter power is not sufficient, the packet is dropped.
• e_tot field: This field holds the total energy required to reach a sink node. The value is used to establish the minimum energy path from each sensor node to the sink nodes. As this field is only used for route calculations, it could also be written into the data field of route request packets.
• typ field: The packet type is held in this field. Currently only two packet types are implemented: route discovery packets and ordinary data packets.
• data field: This field is used as a filler, and holds no actual data during the simulations. In real-world scenarios, however, the data field would hold the sensor readings. Depending on the application, the sensor readings of all sensor nodes on a path from a leaf node of the minimum cost tree to a sink node could be merged into a single data packet. Whenever individual readings are important, a long data field as in this implementation will be required.

REFERENCES

1. Pottie, G. J. and W. J. Kaiser, “Wireless Integrated Network Sensors,” Communications of the ACM, Vol. 43, No. 5, pp. 51-58, May 2000.
2. Kahn, J. M., R. H. Katz and K. S. J.
Pister, “Next Century Challenges: Mobile Networking for Smart Dust,” Proceedings of the Fifth Annual ACM/IEEE International Conference on Mobile Computing and Networking (MOBICOM ’99), pp. 271-278, Seattle, Washington, USA, August 15-20, 1999.
3. Iyengar, S. S., L. Prasad and H. Min, Advances in Distributed Sensor Integration: Application And Theory, Prentice-Hall, 1995.
4. Qi, H., S. S. Iyengar and K. Chakrabarty, “Distributed Sensor Networks-A Review of Recent Research,” Journal of the Franklin Institute, No. 338, pp. 655-668, 2001.
5. Ganesan, D., R. Govindan, S. Shenker and D. Estrin, “Wireless Sensor Networks,” ACM Mobile Computing and Communications Review, Vol. 5, No. 4, October 2001.
6. Akyıldız, İ. F., W. Su, Y. Sankarasubramaniam and E. Çayırcı, “Wireless Sensor Networks: A Survey,” Computer Networks, No. 38, pp. 393-422, 2002.
7. Yuen, W. H. and C. W. Sung, “On Energy Efficiency and Network Connectivity of Mobile Ad Hoc Networks,” Proceedings of the 23rd International Conference on Distributed Computing Systems (ICDCS 2003), Providence, Rhode Island, USA, May 19-22, 2003.
8. Crossbow Technology, Inc., Wireless Sensor Networks, TinyOS, Berkeley Motes from Crossbow, http://www.xbow.com/Products/Wireless_Sensor_Networks.htm, 2004.
9. Pister, K., Smart Dust, http://robotics.eecs.berkeley.edu/~pister/SmartDust/, 2003.
10. Tilak, S., N. B. Abu-Ghazaleh and W. Heinzelman, “Infrastructure Tradeoffs for Sensor Networks,” Proceedings of the First ACM International Workshop on Wireless Sensor Networks and Applications (ACM WSNA 2002), pp. 49-58, Atlanta, September 28, 2002.
11. Clouqueur, T., V. Phipatanasuphorn, P. Ramanathan and K. K. Saluja, “Sensor Deployment Strategy for Target Detection,” Proceedings of the First ACM International Workshop on Wireless Sensor Networks and Applications (ACM WSNA 2002), pp. 42-48, Atlanta, September 28, 2002.
12. Guibas, L. J., “Sensing, Tracking, and Reasoning with Relations,” IEEE Signal Processing Magazine, Vol. 19, No.
2, pp. 73-84, March 2002.
13. Braginsky, D. and D. Estrin, “Rumor Routing Algorithm for Sensor Networks,” Proceedings of the First ACM International Workshop on Wireless Sensor Networks and Applications (ACM WSNA 2002), pp. 22-31, Atlanta, September 28, 2002.
14. Bandyopadhyay, S. and E. J. Coyle, “An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks,” Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2003), San Francisco, CA, USA, April 1-3, 2003.
15. Kawadia, V. and P. R. Kumar, “Power Control and Clustering in Ad Hoc Networks,” Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2003), San Francisco, CA, USA, April 1-3, 2003.
16. Younis, M., M. Youssef and K. Arisha, “Energy-Aware Routing in Cluster-Based Sensor Networks,” Proceedings of the 10th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS ’02), Texas, October 2002.
17. Schwiebert, L., S. K. S. Gupta and J. Weinmann, “Research Challenges in Wireless Networks of Biomedical Sensors,” Proceedings of the 7th Annual International Conference on Mobile Computing and Networking (MOBICOM ’01), pp. 151-165, Rome, July 16-21, 2001.
18. Mainwaring, A., J. Polastre, R. Szewczyk, D. Culler and J. Anderson, “Wireless Sensor Networks for Habitat Monitoring,” Proceedings of the First ACM International Workshop on Wireless Sensor Networks and Applications (ACM WSNA 2002), pp. 88-97, Atlanta, September 28, 2002.
19. Srivastava, M., R. Muntz and M. Potkonjak, “Smart Kindergarten: Sensor-based Wireless Networks of Smart Developmental Problem-solving Environments,” Proceedings of the 7th Annual International Conference on Mobile Computing and Networking (MOBICOM ’01), pp. 132-138, Rome, July 16-21, 2001.
20. Savvides, A., C. Han and M. B.
Srivastava, “Dynamic Fine-Grained Localization in Ad-Hoc Networks of Sensors,” Proceedings of the 7th Annual International Conference on Mobile Computing and Networking (MOBICOM ’01), pp. 166-179, Rome, July 16-21, 2001.
21. Bulusu, N., J. Heidemann and D. Estrin, “GPS-less Low Cost Outdoor Localization for Very Small Devices,” IEEE Personal Communications Magazine, Vol. 7, No. 5, pp. 28-34, October 2000.
22. Doherty, L., K. S. J. Pister and L. E. Ghaoui, “Convex Position Estimation in Wireless Sensor Networks,” Proceedings of the 20th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2001), pp. 1655-1663, Anchorage, Alaska, USA, April 22-26, 2001.
23. Savvides, A., H. Park and M. B. Srivastava, “The Bits and Flops of the N-hop Multilateration Primitive for Node Localization Problems,” Proceedings of the First ACM International Workshop on Wireless Sensor Networks and Applications (ACM WSNA 2002), pp. 112-121, Atlanta, September 28, 2002.
24. Nasipuri, A. and K. Li, “A Directionality based Location Discovery Scheme for Wireless Sensor Networks,” Proceedings of the First ACM International Workshop on Wireless Sensor Networks and Applications (ACM WSNA 2002), pp. 105-111, Atlanta, September 28, 2002.
25. Sohrabi, K., J. Gao, V. Ailawadhi and G. J. Pottie, “Protocols for Self-Organization of a Wireless Sensor Network,” IEEE Personal Communications, Vol. 7, No. 5, pp. 16-27, October 2000.
26. Shih, E., S. Cho, N. Ickes, R. Min, A. Sinha, A. Wang and A. Chandrakasan, “Physical Layer Driven Protocol and Algorithm Design for Energy-Efficient Wireless Sensor Networks,” Proceedings of the 7th Annual International Conference on Mobile Computing and Networking (MOBICOM ’01), pp. 272-286, Rome, July 16-21, 2001.
27. Sankarasubramaniam, Y., İ. F. Akyıldız and S. W.
McLaughlin, “Energy Efficiency based Packet Size Optimization in Wireless Sensor Networks,” Proceedings of the First IEEE International Workshop on Sensor Network Protocols and Applications (SNPA 2003), Anchorage, AK, USA, May 11-15, 2003.
28. Peterson, W. W. and E. J. Weldon, Jr., Error-Correcting Codes, MIT Press, 1972.
29. Havinga, P. J. M. and G. J. M. Smit, “Design Techniques for Low-Power Systems,” Journal of Systems Architecture, Vol. 46, pp. 1-21, 2000.
30. Pahlavan, K. and A. H. Levesque, Wireless Information Networks, John Wiley & Sons, 1995.
31. Rappaport, T. S., Wireless Communication Principles and Practice, Prentice-Hall, 2002.
32. Freier, G. D., University Physics Experiment and Theory, Meredith Publishing Company, 1965.
33. Raghunathan, V., C. Schurgers, S. Park and M. B. Srivastava, “Energy-Aware Wireless Microsensor Networks,” IEEE Signal Processing Magazine, Vol. 19, No. 2, pp. 40-50, March 2002.
34. Feeney, L. M., “An Energy-Consumption Model for Performance Analysis of Routing Protocols for Mobile Ad Hoc Networks,” ACM Journal of Mobile Networks and Applications, Vol. 3, No. 6, pp. 239-249, 2001.
35. Carle, J. and D. Simplot-Ryl, “Energy-Efficient Area Monitoring for Sensor Networks,” Computer, to appear.
36. Gondran, M. and M. Minoux, Graphs and Algorithms, John Wiley & Sons Ltd., 1984.
37. Diestel, R., Graph Theory, Springer-Verlag, New York, 2000.
38. Staddon, J., D. Balfanz and G. Durfee, “Efficient Tracing of Failed Nodes in Sensor Networks,” Proceedings of the First ACM International Workshop on Wireless Sensor Networks and Applications (ACM WSNA 2002), pp. 122-130, Atlanta, September 28, 2002.
39. Bhat, U. N., Elements of Applied Stochastic Processes, John Wiley & Sons, Inc., 1972.
40. Rodoplu, V. and T. H. Meng, “Minimum Energy Mobile Wireless Networks,” IEEE Journal on Selected Areas in Communications, Vol. 17, No. 8, pp. 1333-1344, August 1999.
41. Gomez, J., A. T. Campbell, M. Naghshineh and C.
Bisdikian, “Conserving Transmission Power in Wireless Ad Hoc Networks,” Proceedings of the 9th International Conference on Network Protocols (ICNP 2001), Riverside, California, November 11-14, 2001.
42. Wattenhofer, R., L. Li, P. Bahl and Y. Wang, “Distributed Topology Control for Power Efficient Operation in Multihop Wireless Ad Hoc Networks,” Proceedings of the 20th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2001), pp. 1388-1397, Anchorage, Alaska, USA, April 22-26, 2001.
43. Kar, K., M. Kodialam, T. V. Lakshman and L. Tassiulas, “Routing for Network Capacity Maximization in Energy-constrained Ad-hoc Networks,” Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2003), San Francisco, CA, USA, April 1-3, 2003.
44. Wieselthier, J. E., G. D. Nguyen and A. Ephremides, “On the Construction of Energy-Efficient Broadcast and Multicast Trees in Wireless Networks,” Proceedings of the 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2000), pp. 585-594, Tel-Aviv, Israel, March 26-30, 2000.
45. Chen, P., B. O'Dea and E. Callaway, “Energy Efficient System Design with Optimum Transmission Range for Wireless Ad Hoc Networks,” Proceedings of the 2002 IEEE International Conference on Communications (ICC 2002), pp. 945-952, New York, NY, USA, April 28-May 2, 2002.
46. Heinzelman, W. R., A. Chandrakasan and H. Balakrishnan, “Energy-Efficient Communication Protocol for Wireless Microsensor Networks,” Proceedings of the Hawaii International Conference on System Sciences, Maui, Hawaii, January 4-7, 2000.
47. Catovic, A., Ş. Tekinay and T. Otsu, “Reducing Transmit Power and Extending Network Lifetime via User Cooperation in the Next Generation Wireless Multihop Networks,” Journal of Communications and Networks, Vol. 4, No. 4, pp. 351-362, December 2002.
48. Monks, J., J. P. Ebert, W. M. W. Hwu and A.
Wolisz, “Energy Saving and Capacity Improvement Potential of Power Control in Multi-Hop Wireless Networks,” Computer Networks, No. 41, pp. 313-330, 2003.
49. Opnet Technologies, Inc., Opnet Modeler Home Page, http://www.opnet.com/products/modeler/home.html, 2004.
50. Bertsekas, D. and R. Gallager, Data Networks, Prentice-Hall, 1992.
51. Duracell, Alkaline-Manganese Dioxide Technical Bulletin, http://www.duracell.com/oem/Pdf/others/ATB-full.pdf, 2003.
52. Chang, J. and L. Tassiulas, “Energy Conserving Routing in Wireless Ad-hoc Networks,” Proceedings of the 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2000), pp. 22-31, Tel-Aviv, Israel, March 26-30, 2000.
53. Catedra, M. F. and J. Perez-Arriaga, Cell Planning for Wireless Communications, Artech House Publishers, 1999.
54. Hair, J. J. F., R. E. Anderson, R. L. Tatham and W. C. Black, Multivariate Data Analysis with Readings, Prentice-Hall, 1995.
55. Hartigan, J. A., Clustering Algorithms, Wiley, New York, 1975.
56. Hartigan, J. A. and M. A. Wong, “A K-Means Clustering Algorithm,” Applied Statistics, Vol. 28, pp. 100-108, 1979.
57. Kohonen, T., Self-Organizing Maps, Springer-Verlag, New York, 1995.
58. Neural Networks Research Centre, HUT, SOM-PAK Web site, HUT-CIS-Research SOM_PAK, LVQ_PAK, http://www.cis.hut.fi/research/som_lvq_pak.shtml, 2004.
59. Alcouffe, A. and G. Muratet, “Optimal Location of Plants,” Management Science, Vol. 23, No. 3, pp. 267-274, November 1976.
60. Akınç, U. and B. M. Khumawala, “An Efficient Branch and Bound Algorithm for the Capacitated Warehouse Location Problem,” Management Science, Vol. 23, No. 6, pp. 585-594, February 1977.
61. Buffey, T. B., “Location Problems Arising in Computer Networks,” Journal of the Operational Research Society, Vol. 40, No. 4, pp. 347-354, 1989.
62. Filho, V. J. M. F. and R. D. Galvao, “A Tabu Search Heuristic for the Concentrator Location Problem,” Location Science, No. 6, pp. 189-209, 1998.
63.
Thompson, D. R. and G. L. Bilbro, “Comparison of a Genetic Algorithm with a Simulated Annealing Algorithm for the Design of an ATM Network,” IEEE Communications Letters, Vol. 4, No. 8, August 2000.
64. Boorstyn, R. R. and H. Frank, “Large-Scale Network Topological Optimization,” IEEE Transactions on Communications, Vol. COM-25, No. 1, pp. 29-47, January 1977.
65. Pirkul, H., “Efficient Algorithms for the Capacitated Concentrator Location Problem,” Computers and Operations Research, Vol. 14, No. 3, pp. 197-208, 1987.
66. Skorin-Kapov, D., “On a Cost Allocation Problem Arising from a Capacitated Concentrator Covering Problem,” Operations Research Letters, Vol. 13, No. 5, June 1993.
67. Gavish, B., “Topological Design of Computer Communication Networks-The Overall Design Problem,” European Journal of Operational Research, Vol. 58, No. 2, April 1992.