NODE ACTIVATION POLICIES FOR ENERGY-EFFICIENT COVERAGE IN RECHARGEABLE SENSOR SYSTEMS

By Neeraj Jaggi

A Thesis Submitted to the Graduate Faculty of Rensselaer Polytechnic Institute in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY

Major Subject: Computer and Systems Engineering

Approved by the Examining Committee:
Koushik Kar, Thesis Adviser
Ananth Krishnamurthy, Member
Alhussein A. Abouzeid, Member
Shivkumar Kalyanaraman, Member

Rensselaer Polytechnic Institute
Troy, New York
May 2007

The original of the complete thesis is on file in the Rensselaer Polytechnic Institute Library.

© Copyright 2007 by Neeraj Jaggi
All Rights Reserved

CONTENTS

LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGMENT
ABSTRACT

1. Introduction
  1.1 Rechargeable Sensor Systems
    1.1.1 Effect of Spatial Correlation
    1.1.2 Effect of Temporal Correlation
  1.2 Node Activation Question
    1.2.1 Need for Localized Algorithms
  1.3 Contributions of this Thesis

2. Background and Related Work
  2.1 Energy Management in Ad-hoc and Sensor Networks
  2.2 Coverage and Connectivity Issues
  2.3 Correlation Modeling in Sensor Networks

3. Node Activation in Rechargeable Sensor Systems
  3.1 Introduction
  3.2 Problem Formulation
    3.2.1 Performance Metric
    3.2.2 Challenges and Basic Approach
  3.3 System Model and Assumptions
    3.3.1 Sensor Lifetime Models
    3.3.2 Threshold Activation Policies
  3.4 Analysis
    3.4.1 Upper Bound on $\bar{U}_I^*$ and $\bar{U}_C^*$
    3.4.2 Threshold Activation Policies for the IL Model
    3.4.3 Threshold Activation Policies for the CL Model
    3.4.4 Comparison of IL and CL Models
  3.5 Numerical Results
    3.5.1 Performance under IL and CL models
    3.5.2 Distribution Independence
  3.6 Summary

4. Distributed Activation Policies under Random Sensor Deployment
  4.1 Distributed Node Activation Algorithm
  4.2 Upper Bound on Optimal Time-average Utility
  4.3 Sensor Lifetime Models
  4.4 Simulation Results and Discussion

5. Node Activation in Partially Rechargeable Sensor Systems
  5.1 Introduction
  5.2 Model, Formulation and Contribution
    5.2.1 System Model and Formulation
    5.2.2 Methodology and Contribution
  5.3 Preliminaries
    5.3.1 Recharge and Discharge Process Models
    5.3.2 Identical Sensor Coverage
    5.3.3 Spatial Correlation Models
    5.3.4 Upper Bound on the Optimal Time-average Utility
  5.4 Aggressive Activation Policy
    5.4.1 Analysis
    5.4.2 Simulation Results
  5.5 Threshold Activation Policies
    5.5.1 Analysis of Threshold Policies
    5.5.2 Simulation Results
  5.6 Activation Policies in a General Network Scenario
    5.6.1 Distributed Threshold Activation Algorithm
    5.6.2 Choice of Threshold
    5.6.3 Discharge and Recharge Event Models
    5.6.4 Simulation Results
  5.7 Summary

6. Rechargeable Sensor Activation under Temporally Correlated Events
  6.1 Problem Formulation
    6.1.1 On-Off Periods
    6.1.2 System Observability
    6.1.3 Activation Policies
      6.1.3.1 Aggressive Wakeup (AW) policy
      6.1.3.2 Correlation-dependent Wakeup (CW) policy
  6.2 Activation under Perfect State Information
    6.2.1 Upper Bound on Achievable Performance
    6.2.2 Optimal Policy
    6.2.3 Optimal Policy evaluation using Value Iteration
    6.2.4 Activation Algorithm
  6.3 Activation under Imperfect State Information
    6.3.1 Structure of Optimal Policy
      6.3.1.1 MDP Formulation
      6.3.1.2 Properties of ε-optimal policies
      6.3.1.3 Optimal Policy evaluation using Value Iteration
    6.3.2 Energy Balancing Correlation-dependent Wakeup Policies
      6.3.2.1 Upper Bound on CWP Performance
      6.3.2.2 Performance of EB-CW policy
      6.3.2.3 Performance Effects of Boundary Conditions
    6.3.3 Performance of AW Policy
    6.3.4 Simulation Results
  6.4 Temporally Uncorrelated Event Occurrence
  6.5 Summary and Conclusions

7. Effect of Temporal Correlations in multiple-sensor systems
  7.1 Problem Formulation
    7.1.1 Performance Metric
    7.1.2 Threshold based Activation Policies
      7.1.2.1 Time-invariant Threshold Policy (TTP)
      7.1.2.2 Correlation-dependent Threshold Policy (CTP)
  7.2 Performance Evaluation
    7.2.1 Upper Bound on Achievable Performance
    7.2.2 Performance of Threshold Policies
    7.2.3 Simulation Results
  7.3 Summary

8. Concluding Remarks and Future Work
  8.1 Future Directions

LITERATURE CITED

LIST OF TABLES

3.1 Performance ratio of threshold policies for low detection probability
3.2 Performance ratio of threshold policies for high detection probability
6.1 ε-Optimal actions for perfect state information
6.2 Optimal actions for sample cases
7.1 Performance for various threshold pairs

LIST OF FIGURES

1.1 Wireless Sensor Networks: Applications
3.1 Rechargeable Sensor: States and Transitions
3.2 Utility Function based Performance Metric
3.3 Queuing network representation for the IL model
3.4 Queuing network representation for the CL model
3.5 Queuing network model of intermediate system
3.6 Performance of threshold policies for low detection probability
3.7 Performance of threshold policies for high detection probability
3.8 Transient sensor system behavior
3.9 Threshold policy performance under various distributions of recharge and discharge intervals
4.1 Performance with global thresholds
4.2 Performance with local thresholds
4.3 Performance with global thresholds and event correlation
4.4 Performance with local thresholds and event correlation
5.1 Quantum-queue model of a sensor
5.2 Performance of Aggressive Activation Policy
5.3 Queuing System representation of sensor system
5.4 IDR Modification Models
5.5 CDR Modification Models
5.6 Performance of Threshold Policies
5.7 Network Performance with local thresholds
6.1 Energy discharge-recharge model of the sensor
6.2 Temporally correlated event occurrence
6.3 Threshold energy wakeup functions
6.4 Threshold energy wakeup functions for symmetric case
6.5 Sensor activation under CW policy
6.6 Event occurrence upon activation for AW policy
6.7 Performance of CW Policies
7.1 Performance of Threshold Policies

ACKNOWLEDGMENT

I am grateful to my research advisor Prof. Koushik Kar for his guidance and support throughout the course of my PhD. His enthralling ideas and witty comments provided me with the needed encouragement and paved the way towards a successful thesis. I am indebted to Prof.
Ananth Krishnamurthy for his kind and gentle attitude, and for the fruitful discussions we had during the process of this research. I am thankful to Prof. Alhussein A. Abouzeid and to Prof. Shivkumar Kalyanaraman for agreeing to serve on my doctoral committee. My friends Nabhendra Bisnik, Vicky Sharma and Vijay Subramanian deserve special mention here, as they have been a substantial part of my life at RPI. Sharing their experiences and discussing my own experiences with them helped me grow personally as well as professionally. I am grateful to Shu Han, who provided supportive companionship throughout the course of my stay at RPI. I am grateful to my brother Pankaj for providing me with the initial motivation to pursue a PhD, and to his friend Prof. Biplab Sikdar for encouraging me to join RPI. Last but not least, I acknowledge the sincere efforts and affection of my family and friends, who stood by me throughout, and without whose support my PhD would not have materialized.

ABSTRACT

Advances in sensor network technology enable sensor nodes with renewable energy sources, e.g. rechargeable batteries, to be deployed in the region of interest. The random nature of discharge and recharge, along with spatio-temporal correlations in event occurrences, poses significant challenges in developing energy-efficient algorithms for sensor operations. An important issue in a system of rechargeable sensors is the node activation question: how, when and for how long should a sensor node be activated so as to optimize the quality of coverage in the system? We consider two different energy consumption models for a sensor, namely (i) the Full Activation model, where a sensor can only be activated when fully recharged, and (ii) the Partial Activation model, where the sensor can be activated even when it is partially recharged.
In the presence of spatial correlations in the discharge and/or recharge processes, with identical sensor coverages, we show analytically that there exists a simple threshold activation policy that achieves a performance of at least 3/4 of the optimum over all policies under Full Activation, and is asymptotically optimal with respect to sensor energy bucket size under Partial Activation. We extend threshold policies to a general sensor network where each sensor partially covers the region of interest, and demonstrate through simulations that a local information based threshold policy achieves near-optimal performance. We then consider the scenario where the events of interest show a significant degree of temporal correlation across their occurrences, and pose the rechargeable sensor activation question in a stochastic decision framework. Under complete state observability, we outline the structure of a class of deterministic, memoryless policies that approach optimality as the sensor energy bucket size becomes large. Under partial observability, we outline the structure of the history-dependent optimal policy, and develop a simple, deterministic, memoryless activation policy based upon energy balance which achieves near-optimal performance under certain realistic assumptions. With multiple sensors having identical coverages, threshold based activation policies achieve near-optimal performance. The energy-balancing threshold policies are thus robust to spatio-temporal correlations in the discharge and recharge phenomena.

CHAPTER 1
Introduction

Major innovations in hardware technologies in recent years have led to the development of small, low-cost sensor devices. It is envisioned that in the next few decades, these devices will be deployed in large numbers over vast areas, with the purpose of gathering data from the deployment region.
Applications of such large-scale data gathering systems are numerous, and include military surveillance, environmental and health monitoring, and disaster recovery [3]. The data gathered could be used for various purposes, such as constructing a temperature or pollution map of the region, determining the location of a herd of wild animals or a shoal of fish, or detecting unusual movements in the area under surveillance. Figure 1.1 depicts some of the envisaged applications of a large-scale Wireless Sensor Network.

Figure 1.1: A variety of envisioned applications of Wireless Sensor Networks. (Courtesy http://www.zess.uni-siegen.de/cms/front_content.php?idcat=76)

In many of these applications, however, sensors are heavily constrained in terms of energy. Sensors are often powered by batteries, and limitations on the size of the sensor put constraints on the available battery energy. In applications that involve long-term monitoring, such as those depicted in Figure 1.1, typical battery lifetimes may be significantly shorter than the time over which the region of interest needs to be monitored. Therefore, effective management of energy is crucial to the performance of these large-scale sensing systems. For energy-efficiency reasons, we would like to activate (or “switch on”) a sensor only when it is expected to improve the system performance significantly; the rest of the time, the sensor can remain in an inactive state (or “switched off”) so as to conserve energy.

1.1 Rechargeable Sensor Systems

A large number of sensor network applications involve monitoring of a geographically vast area over an extended period of time. Since the deployment region is vast, and often inaccessible, periodic replacement of sensor batteries may not be a viable solution. For long-term monitoring of such environments, sensors can be deployed with rechargeable batteries which are capable of harnessing energy from renewable sources in the environment, such as solar power [40, 60].
Note that recharging can be a very slow process, possibly influenced by random environmental factors such as the intensity of sunlight or the speed of wind. Typically, the average rate of recharging would be significantly less than the average energy discharge rate during the sensing period. As a result, a sensor may need to spend most of its lifetime in the “off” state, in which it is not sensing but only recharging. In addition, these sensor devices, although cheap, are typically unreliable. Therefore, to improve sensing reliability, we want multiple sensors to cover the area of interest simultaneously. These factors motivate redundant deployment of sensors to cover the area of interest, so that the sensor system remains operational with high probability at any given time. If a larger number of sensors is deployed, it is likely that more of these sensors would remain charged (and hence could be used for sensing) at any given time.

1.1.1 Effect of Spatial Correlation

In general, the random process that governs the sensing environment will show a significant degree of spatial correlation. As an example, the pollution level at any point is expected to have high correlation with respect to both space and time. In other words, if the pollution level at a point is currently above a certain threshold, it is likely to exceed this threshold at neighboring points in the near future. Since the energy discharge rates at sensors often depend on the detection and reporting of such interesting events, the energy discharge process at sensors which are active at any particular time could show some degree of spatial correlation. Moreover, in certain cases, the recharging process could also exhibit spatial correlations to a significant extent. This is true, for instance, in sensors that are recharged by solar power, since the intensity of sunlight exhibits correlation in space and time.
The dynamic nature of the sensing environment dictates that an efficient data gathering strategy must be adaptive in nature. The presence of spatial correlations in the discharge and/or recharge processes complicates the design of the optimal data gathering strategy. In some cases, prior knowledge of the degree of correlation could be used to activate or deactivate the sensors appropriately so as to optimize system performance. Spatial correlation of energy discharge and recharge processes could significantly affect the system performance. Thus, there is a need to develop data gathering strategies which are robust even in the presence of spatial correlations.

1.1.2 Effect of Temporal Correlation

In addition, the event occurrence process might exhibit significant correlation in time. For instance, if the temperature at any location in a forest rises above 100°F (representing a possibility of forest fire), then with high probability it will remain above the given threshold in the near future as well. Similarly, if the temperature is much below a critical threshold, it is expected to remain so for a while. Smart sensor node activation policies should take into account the degree of temporal correlation in (and the current status of) the event occurrence process when deciding dynamically whether to activate or deactivate (put to sleep) a sensor node.

1.2 Node Activation Question

The data gathering objective in a sensor network is to reliably sense and communicate the events observed by the sensor system. Loosely speaking, an event is an interesting occurrence at any point in the monitored region that we would like to know about. For instance, in pollution monitoring applications, an event can correspond to the level of pollution exceeding a predetermined level at any point in the region.
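To make the notion of a temporally correlated event process (Section 1.1.2) concrete, the following is a minimal sketch that models event occurrence as a two-state on-off chain in discrete time. The chain itself, the function name `simulate_on_off_events`, and the parameters `p_stay_on` and `p_stay_off` are illustrative assumptions and are not taken from the thesis; values of these probabilities close to 1 correspond to strong temporal correlation.

```python
import random


def simulate_on_off_events(p_stay_on, p_stay_off, steps, seed=0):
    """Simulate a hypothetical two-state (on/off) event process.

    p_stay_on  -- assumed probability that an ongoing event persists one more slot
    p_stay_off -- assumed probability that a quiet period persists one more slot
    Returns a list of booleans, one per time slot (True = event present).
    """
    rng = random.Random(seed)
    state = False  # start in the "off" (no event) state
    trace = []
    for _ in range(steps):
        stay = p_stay_on if state else p_stay_off
        # Flip the state when the draw falls outside the persistence probability.
        if rng.random() >= stay:
            state = not state
        trace.append(state)
    return trace
```

With both persistence probabilities near 1, the trace consists of long runs of identical values, which is the situation in which knowledge of the current event status is most informative for an activation decision.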
Since detecting and reporting of events requires energy and sensors are energy constrained, developing efficient data gathering strategies is closely related to efficient use of sensor energy. In addition, the low-cost and failure-prone nature of sensor nodes, together with the infeasibility of accurate sensor placement under dire circumstances, leads to a random and redundant deployment of such nodes. We pose the data gathering problem for rechargeable sensor systems as a dynamic sensor node activation question. The dynamic node activation question involves determining when each sensor should be involved in data gathering (activation) and when it should be put in sleep mode (deactivation), so as to maximize the long-term reliability index of the system. The measure of system reliability can be formulated in terms of the quality of coverage provided or the event detection probability in the system. Note that since the sensors are heavily energy-constrained, activating a sensor whenever possible may not be a good node activation strategy. The key to obtaining efficient node activation policies is to activate some sensors currently, while keeping a sufficient number of sensors in store for future use. For a system with a single sensor node, efficient activation decisions would lead to a larger fraction of events being detected in the long run.

1.2.1 Need for Localized Algorithms

Since events and sensor discharge and/or recharge constitute random processes, obtaining optimal solutions to the dynamic node activation questions we consider requires solving complex stochastic decision problems. However, these problems are computationally very difficult to solve optimally even in simple special cases, particularly when the region of interest is covered by all the sensor nodes in the system. Moreover, exact solutions to these stochastic decision questions require global knowledge and coordination, and can only be useful as a static or off-line approach.
In practical scenarios, however, a sensor node would typically have access to only local topology and state information, and would be required to make an activation decision based only upon this local information. Hence, there is a need to develop distributed, low-overhead and local information based algorithms for addressing the node activation questions in a rechargeable sensor system.

1.3 Contributions of this Thesis

We start by answering the node activation question for a system of rechargeable sensors, wherein the sensor nodes can only be activated when fully charged. Finding the optimal sensor node activation policy in such a case is a very difficult decision question, and under Markovian assumptions on the sensor discharge/recharge periods, it represents a complex semi-Markov decision problem. With the goal of developing a practical, distributed but efficient solution to this complex, global optimization problem, we first consider the activation question for a set of sensor nodes, where all the sensors are able to cover the region of interest (identical coverage). For this scenario, we show analytically that there exists a simple threshold activation policy that achieves a performance of at least 3/4 of the optimum over all policies. We extend this threshold policy to a general network setting where each sensor partially covers the region of interest, and the coverage areas of different sensors could have partial or no overlap with each other, and show using simulations that the performance of our policy is very close to that of the globally optimal policy. Our policy is fully distributed, and requires a sensor to only keep track of the node activation states in its immediate neighborhood. We also consider the effects of spatial correlation on the performance of threshold policies, and the choice of the optimal threshold.
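The idea of a local-information threshold rule can be illustrated with a short sketch. This is not the distributed algorithm developed later in the thesis; it is a hypothetical rule written under the assumption that a fully recharged sensor switches on only while fewer than `threshold` of its immediate neighbors are active. The function name `should_activate` and all of its parameters are illustrative.

```python
from typing import Dict, Iterable, Set


def should_activate(node: int,
                    ready: Set[int],
                    active: Set[int],
                    neighbors: Dict[int, Iterable[int]],
                    threshold: int) -> bool:
    """Hypothetical local threshold rule for one sensor node.

    ready     -- nodes that are fully recharged and currently inactive
    active    -- nodes that are currently sensing
    neighbors -- adjacency map describing each node's immediate neighborhood
    threshold -- assumed cap on the number of active neighbors
    """
    # Only a recharged, currently inactive node may decide to switch on.
    if node not in ready or node in active:
        return False
    # The decision uses nothing beyond the states of immediate neighbors.
    active_neighbors = sum(1 for n in neighbors[node] if n in active)
    return active_neighbors < threshold
```

For example, with three mutually neighboring nodes, one of them active and a threshold of 2, a second recharged node would switch on; with a threshold of 1 it would stay asleep and hold its energy in reserve.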
We then consider the case where the recharge process at a sensor node is a continuous process, regardless of the current state (active or inactive) of the sensor node. This scenario is motivated by the fact that renewable energy sources, such as sunlight, could drive the rechargeable batteries at the sensor nodes. In this system model, a sensor node can hold up to K quanta of energy, and can be activated even when it is partially recharged. For the case of identical sensor coverages, we show that the class of threshold policies is asymptotically optimal with respect to K, i.e., the performance of such a policy for a chosen threshold parameter approaches the optimal performance as K becomes large. We also show that the performance of the optimal threshold policy is robust to the degree of spatial correlation in the discharge and/or recharge processes. We then extend this approach to a general sensor network, and demonstrate through simulations that a local information based threshold policy, with an appropriately chosen threshold, achieves a performance which is very close to the global optimum. Finally, we consider the node activation question where the events of interest show a significant degree of temporal correlation across their occurrences. The optimization question in such systems is: how should the rechargeable sensor be activated in time so that the number of interesting events detected is maximized under the typically slow rate of recharge of the sensor? We first consider the activation question for a single sensor, and pose it in a stochastic decision framework. The recharge-discharge dynamics of a rechargeable sensor node, along with temporal correlations in the event occurrences, make the optimal sensor activation question very challenging.
Under complete state observability, we outline the structure of a class of deterministic, memoryless policies that approach optimality as the energy bucket size at the sensor becomes large; in addition, we provide an activation policy which achieves the same asymptotic performance but does not require the sensor to keep track of its current energy level. For the more practical scenario, where the inactive sensor may not have complete information about the state of event occurrences in the system, we outline the structure of the deterministic, history-dependent optimal policy. We then develop a simple, deterministic, memoryless activation policy based upon energy balance and show that this policy achieves near-optimal performance under certain realistic assumptions. For the case with multiple sensors having identical coverage, we show that threshold based activation policies are, in general, robust to temporal correlations and achieve near-optimal performance.

This thesis is organized as follows. We review the background literature on energy efficiency in wireless ad-hoc and sensor networks in Chapter 2. We also discuss related work in energy management and correlation modeling in sensor networks and outline the approaches and techniques proposed in the past. We present the dynamic sensor node activation question in rechargeable sensor systems in Chapter 3. We start by modeling the sensor states and transitions with respect to their energy levels. We then formulate a utility based global performance criterion which serves as a measure of performance for the node activation policies we develop and analyze. We study the performance characteristics of a certain class of activation policies, namely threshold activation policies, and derive analytical bounds on the performance of such policies for a chosen threshold parameter, for the case where all the sensor nodes can completely cover the region of interest.
Threshold activation policies are of particular interest due to their simplicity and the ease of distributed implementation based only upon local information. We also study the effects of spatial correlation on the performance of threshold activation policies and show that these effects depend upon the threshold parameter chosen. We then extend these threshold based policies to a distributed network setting in Chapter 4, wherein a sensor node covers the region of interest only partially, and the coverage areas of any two sensor nodes may overlap partially, completely or not at all, depending upon the random placement of the sensor nodes in the network. We develop distributed implementations of threshold policies, evaluate their performance through extensive simulations and show that these policies perform very close to the optimal performance for an appropriate choice of threshold. In particular, we show that a local information based policy achieves near-optimal performance in a randomly deployed network of rechargeable sensors.

The sensor model considered in Chapters 3 and 4 is restrictive since it does not allow a sensor node to be activated while it is being recharged. In practice, the recharge process may be a continuous process occurring at all the sensor nodes simultaneously and at all times. This motivates the need to be able to activate the sensor even when it is partially recharged, i.e., as long as it has a non-zero energy level. In Chapter 5, we model the partially rechargeable sensor nodes such that they can hold up to K quanta of energy and can be activated at any time if they have sufficient energy to be able to sense. We model the recharge and discharge processes at the sensor nodes as Poisson processes and study the performance of the class of threshold activation policies in such partially rechargeable sensor systems.
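The partially rechargeable model just described can be sketched as a small simulation of a single sensor with an energy bucket of size K. The sketch rests on several assumptions that are not stated in the text above: recharge quanta arrive as a Poisson process at all times, discharge quanta are consumed as a Poisson process only while the sensor is active, the sensor is active whenever its energy level is non-zero, and a recharge quantum arriving at a full bucket is lost. The function name `simulate_energy_bucket` and its parameters are hypothetical.

```python
import random


def simulate_energy_bucket(K, recharge_rate, discharge_rate, horizon,
                           seed=0, initial=0):
    """Simulate one sensor holding up to K energy quanta (illustrative sketch).

    Returns (fraction of time active, list of energy levels after each event).
    """
    rng = random.Random(seed)
    t, energy, active_time = 0.0, initial, 0.0
    levels = [energy]
    while t < horizon:
        active = energy > 0  # assumed policy: sense whenever energy is non-zero
        # Superposition of the two Poisson streams that can fire in this state.
        rate = recharge_rate + (discharge_rate if active else 0.0)
        if rate == 0.0:
            # No further events can occur; the state is frozen until the horizon.
            if active:
                active_time += horizon - t
            break
        dt = rng.expovariate(rate)
        if active:
            active_time += min(dt, horizon - t)
        t += dt
        if t >= horizon:
            break
        # Attribute the event to recharge or discharge in proportion to the rates.
        if rng.random() < recharge_rate / rate:
            energy = min(K, energy + 1)  # quantum is lost when the bucket is full
        else:
            energy -= 1
        levels.append(energy)
    return active_time / horizon, levels
```

A call such as `simulate_energy_bucket(K=5, recharge_rate=1.0, discharge_rate=3.0, horizon=1000.0)` gives a rough feel for how a slow recharge rate limits the fraction of time a sensor can spend sensing; replacing the always-on rule with a threshold rule would be the natural next step.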
In particular, we show that the class of threshold policies is asymptotically optimal with respect to the energy buffer size K for the special case where all the sensors are able to cover the region of interest, i.e., for an appropriately chosen threshold, the performance of the threshold activation policy approaches the optimal performance as K → ∞. We also model spatial correlation in the discharge and recharge processes at the sensor nodes and show the robustness of the chosen threshold policy in the presence of such correlations. Similar to the technique we followed for rechargeable sensor systems, we extend our threshold policies to a distributed network of partially rechargeable sensors in Chapter 5. We develop distributed threshold based activation policies, evaluate their performance through extensive simulations and show that these policies perform very close to the optimal performance for an appropriate choice of threshold.

Next, we consider the node activation question in rechargeable sensor systems where the occurrence of interesting events is correlated in time. We first consider the effect of such temporal correlations on the performance of systems with a single sensor node in Chapter 6. We model the sensor system evolution under complete observability as a Markov decision process, and under partial observability as a partially observable Markov decision process. Under complete state observability, we show that a simple, correlation-dependent activation policy achieves optimal performance for large energy bucket size K. Under partial observability, where the inactive sensor may not have complete information about the state of event occurrences in the system, we outline the structure of the deterministic, history-dependent optimal policy. We observe that the optimal policy is heavily dependent on system parameters, and is not easily implementable in practice.
Therefore, we develop a simple, deterministic, memoryless activation policy which is based upon energy balance and achieves near-optimal performance under certain realistic assumptions. We consider the node activation question for a rechargeable sensor system with multiple sensors in Chapter 7, and show that threshold policies are, in general, robust to the presence of temporal correlations across events. We develop threshold-based activation policies which achieve near-optimal performance under these scenarios. Chapter 8 summarizes our results and conclusions. We also provide directions for future research in this newly formulated and promising area.

CHAPTER 2 Background and Related Work

There has been tremendous research interest in ad-hoc and sensor networks in recent years. An excellent survey of different sensor network applications, as well as a discussion of some major issues in sensor networks, is provided in [3]. Some of the important issues considered in wireless sensor networks include coverage, connectivity, energy efficiency, data aggregation, and network lifetime. A survey of various algorithms employed to address the above issues in sensor networks is provided in [63]. In recent years, there has been a considerable degree of interest in energy management issues in individual sensors, sensor systems, and wireless ad-hoc networks. We outline some of these contributions in Section 2.1. A measure of energy efficiency in non-rechargeable sensor networks is the network lifetime. Approaches have been suggested to extend the network lifetime in the presence of coverage and/or connectivity constraints, and energy-constrained sensor nodes. We discuss issues related to energy-efficient coverage and connectivity in Section 2.2. Section 2.3 discusses approaches used to model spatio-temporal correlations in sensor networks.
Note that there does not exist sufficient literature on the management of energy-constrained rechargeable sensor systems, or on the effect of spatio-temporal correlations while managing such sensor systems. Therefore, many of the perspectives listed below are not directly related to the node activation questions we consider in rechargeable sensor systems.

2.1 Energy Management in Ad-hoc and Sensor Networks

There has been a considerable amount of work on energy-efficient medium access control and adaptive wakeup of sensors, although all these perspectives consider energy-constrained but non-rechargeable sensors. Energy-efficient medium access control protocols have been studied in [21, 22, 61, 80, 81]. The problem of minimizing power consumption during idle times is addressed in [19, 46]. In [10], the authors use occupancy theory to analyze the effect of switching off idle nodes on the network lifetime. A discussion of the importance of energy management in ad-hoc and sensor networks, along with a description of various performance objectives, is provided in [69]. Energy-efficient battery management strategies have been studied in [1, 2]. [20] proposes a framework that allows each battery-powered terminal to autonomously derive its optimal power management policy. Through the derived policy, an optimal trade-off between packet loss probability and mean packet delay on one hand, and energy consumption on the other hand, is obtained. Energy-conscious medium access control and scheduling has been considered in [5, 59, 62]. [39] proposes adaptively choosing the ATIM (ad-hoc traffic indication message) window according to the network load in order to save energy without degrading throughput. [58] considers transmitting a packet over a longer time period to reduce power consumption, which involves a trade-off between the delay incurred and energy consumption.
In [54], the effects of power conservation, coverage and cooperation on data dissemination are investigated for a particular data-sharing architecture. Optimization-based energy-efficient routing strategies have been studied in [17, 18, 42, 49]. [17] uses an exponential cost function defined in terms of residual energy at the nodes and the link costs in order to select routes and power levels such that the network lifetime is maximized. In the process, the energy consumption rates at the nodes turn out to be proportional to their residual energies. Various other energy-efficient routing protocols have been proposed in [25, 67]. [30] proposes LEACH (Low-Energy Adaptive Clustering Hierarchy), a self-organizing routing protocol based on localized coordination, which results in an even distribution of energy dissipation and thus enhances the network lifetime. Trade-offs between energy and robustness in ad-hoc network routing are studied in [47], where the authors argue that single-path routing with high power can also be energy efficient, compared to the conventional approach of multipath routing to provide robustness. In [82], the authors study the trade-offs between routing latency and energy usage, and provide methods of computing efficient data gathering trees. Other interesting work related to energy minimization includes [31, 72, 77, 84]. Some of the node activation problems discussed in this thesis are related to the energy management questions outlined above and in [68, 78, 79]. However, most of these works do not consider the node activation question in the context of data gathering applications. Moreover, they do not consider rechargeable sensor systems, or spatio-temporally correlated event phenomena. Lin et al. [50] consider sensors with renewable energy sources and focus on the development of an efficient routing strategy in a network with rechargeable sensors.
Their work uses a worst-case competitive analysis approach, in contrast to the stochastic decision framework used in our case. Machine learning approaches towards adaptive power management have been considered in [73], where the authors also discuss the advantages of using a model-based (POMDP) approach to model unobservable user behavior during the process of optimal decision making. Borkar et al. [11] consider efficient scheduling of transmissions for an energy-constrained wireless device. The possible decisions for the device include transmitting, remaining idle or reordering the battery. Here the goal is to minimize the overall cost of the transmission decision policy, and the authors show that the optimal decision policy for reordering the battery is threshold based, where the threshold represents the relative difference between the current charge level of the battery and the current buffer length. Recently, the problem of controlling the activity of a rechargeable sensor node has been studied in [6], where the Norton’s equivalent of a closed three-queue system is obtained to show optimal rate control policies for various combinations of rate structures and utility functions.

2.2 Coverage and Connectivity Issues

The issue of coverage has been studied extensively in the literature. Area coverage, where the goal is to monitor a specified region, has been considered in [70, 83, 75]. Target (or point) coverage has been studied in [13, 15, 14, 41, 24]. Coverage has also been studied from the perspective of maximal support (or breach) path in [53, 65]. In [24], deterministic sensor placement allows for topology-aware placement and role assignment, where nodes can either sense or serve as relay nodes. Zhang and Hou [83] show that if the communication range of sensors is at least twice as large as their sensing range, then coverage implies connectivity.
They also develop some optimality conditions for sensor placement and develop a distributed algorithm to approximate those conditions, given a random placement of sensors. An important method for extending the network lifetime for the area coverage problem is to design a distributed and localized protocol that organizes the sensor nodes in sets. The network activity is organized in rounds, with sensors in the active set performing the area coverage, while all other sensors are in the sleep mode. Set formation is done based on the problem requirements, such as energy-efficiency, area monitoring, connectivity, etc. Different techniques have been proposed in literature [75, 83] for determining the eligibility rule, that is, to select which sensors will be active in the next round. This notion of classifying the sensor nodes into disjoint sets, such that each set can independently ensure coverage and thus could be activated in succession, has been considered in [13, 70]. Cardei and Du [13] show that the disjoint set cover problem is NP-complete and propose an efficient heuristic for set cover computations using a mixed integer programming formulation. A similar centralized heuristic for area coverage has been proposed in [70], where the region is divided into multiple fields such that all points in one field are covered by the same set of sensors. Then, a most-constrained least-constraining coverage heuristic is developed which is empirically shown to perform well. In [15] the constraints for the set of sensors to be disjoint and for these sets to operate for equal time intervals, are relaxed and two heuristics, one using linear programming and the other using a greedy approach are proposed and verified using simulation results. [29, 85] consider connected coverage and provide approximation algorithms to find one minimal subset of sensor nodes to guarantee (k-)coverage and connectivity. 
An integrated coverage and connectivity framework [75] has also been proposed, where the goal is to allow configuring varying degrees of coverage and to maximize the number of sensor nodes scheduled to sleep at each stage. The coverage configuration protocol [75] is integrated with SPAN [19] to ensure connectivity in the network. However, the authors note that the network lifetime in such a framework does not scale linearly with the number of sensing nodes due to periodic beacon exchanges. In [33], a greedy iterative disjoint set computation algorithm is developed to ensure coverage and connectivity in the network, and it is shown through extensive simulations that the network lifetime scales linearly with the number of sensor nodes in the network.

2.3 Correlation Modeling in Sensor Networks

[4, 74] consider exploiting spatial and temporal correlations in the sensed data to develop efficient MAC and transport layer communication protocols. The issues faced include controlling the representative sensors (which are allowed to transmit) and their transmitting frequencies, in order to comply with the desired maximum distortion level while minimizing energy consumption in the process. Information-theoretic aspects of correlation in sensor networks have been studied in [26]. Data aggregation schemes to perform routing with compression in the presence of spatial correlations have been studied in [56].

CHAPTER 3 Node Activation in Rechargeable Sensor Systems

3.1 Introduction

In this chapter, we consider a system of rechargeable sensor nodes deployed redundantly in the region of interest for monitoring and data gathering purposes. As discussed in Chapter 1, if a large number of sensors are deployed, it is likely that more of these sensors would remain charged (and hence can be used for sensing) at any given time. Thus, the overall system performance would typically improve (possibly with diminishing returns) with a more redundant deployment of sensors.
We assume that sensor nodes involved in sensing get discharged after a certain duration of time, and need to be recharged until they can start sensing again (Full Activation model). We consider the decision problem of when the recharged sensors should be activated (i.e., switched “on”), so as to maximize the long-term utility of the system. In Section 3.2, we formalize the node activation problem, describe the performance metric considered and discuss the challenges involved. We elaborate on the system model and the underlying assumptions in Section 3.3. We describe a particular class of node activation policies, namely threshold activation policies, in Section 3.3.2. Section 3.4 discusses performance bounds for node activation policies and the performance evaluation of specific threshold activation policies. We present numerical results depicting the performance of threshold activation policies in comparison to the derived bounds in Section 3.5 and summarize our analytical and numerical results in Section 3.6.

3.2 Problem Formulation

At any instant of time, a rechargeable sensor node could be in one of three states:

• Active: The sensor is sensing or is activated. A sensor in the ‘active’ state suffers a gradual depletion of its battery energy, and enters the ‘passive’ state when its battery gets completely discharged.
• Passive: The sensor is switched “off” or deactivated, due to complete discharge of its battery energy. It is simply recharging its battery and is not sensing.
• Ready: The sensor has completely recharged its battery and can be activated or put to sensing. The sensor does not participate in sensing, and waits to get activated.

Figure 3.1: Rechargeable Sensor: States and Transitions under Full Activation model

Figure 3.1 explains the three sensor states and the transitions between them. Let discharge time denote the time a sensor spends in the ‘active’ state, and recharge time denote the time a sensor spends in the ‘passive’ state.
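The three states and their transitions can be sketched as a small state machine. This is only an illustrative sketch: the event names (`discharged`, `recharged`, `activate`) and the helper `step` are hypothetical labels introduced here, not terminology from the thesis.

```python
from enum import Enum


class SensorState(Enum):
    ACTIVE = "active"    # sensing, battery draining
    PASSIVE = "passive"  # fully discharged, recharging, not sensing
    READY = "ready"      # fully recharged, waiting to be activated


# Transitions of the Full Activation model, keyed by (state, event).
# The event names are illustrative assumptions.
TRANSITIONS = {
    (SensorState.ACTIVE, "discharged"): SensorState.PASSIVE,
    (SensorState.PASSIVE, "recharged"): SensorState.READY,
    (SensorState.READY, "activate"): SensorState.ACTIVE,
}


def step(state, event):
    """Return the next state, or raise ValueError for a disallowed event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state.name}") from None
```

Under this sketch a passive sensor cannot be activated directly; it must first complete its recharge and pass through the ready state, which is the restriction that distinguishes the Full Activation model.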
In a realistic sensing environment, the discharge and recharge times will depend on various random factors. Sensors can transmit information (resulting in energy usage) on the occurrence of “interesting” events, which may be generated according to a random process. Therefore, in our system model, we assume that the discharge and recharge times are random, although we study the special case of deterministic discharge and recharge times as well. Although a sensor can power itself off during the ‘ready’ state, it has to wake up periodically and exchange messages with its neighboring sensors to keep track of the system state in its neighborhood. Therefore, in reality, we would expect that energy will be drained even in the ‘ready’ state, but probably at a fairly steady rate (possibly due to polling its neighbors to check the system activation state). However, the energy discharge rate in the ‘ready’ state can be expected to be much slower than the discharge rate in the ‘active’ state.

3.2.1 Performance Metric

The ability to equip the sensor node with rechargeable batteries adds a new dimension to the energy management issues in sensor networks. As outlined in Chapter 1, the network lifetime is an appropriate metric to measure the energy efficiency of a proposed routing or node activation policy in networks of non-rechargeable sensors. However, the ability to recharge allows a completely discharged sensor node to regenerate itself after some time. This allows the sensor network to continuously sustain itself, provided there is sufficient redundancy in the number of sensor nodes deployed in the network. In this case, network lifetime is no longer the appropriate metric. Therefore, a new performance metric needs to be formulated in order to evaluate the performance of node activation policies in such scenarios.
We characterize the performance of the rechargeable sensor system by a continuous, non-decreasing, strictly concave function U satisfying U(0) = 0. More specifically, U(n) represents the utility derived per unit area, per unit time, from n active sensors covering an area. Note that different sensors can be located at different points in the overall physical space of interest, and the coverage patterns of different nodes can be different. Therefore, the coverage areas of different sensors will typically be different. This implies that at any time, utilities in different parts of the area of interest can differ significantly from one another. Note that the strict concavity assumption merely states the fact that the system has diminishing returns with respect to the number of active sensors. As an example of a practical utility function, consider the scenario where each sensor can detect an event with probability $p_d$. If the utility is defined as the probability that the sensing system is able to detect an event, then $U(n) = 1 - (1 - p_d)^n$, where n is the number of sensors that are active. Note that this utility function is strictly concave, and satisfies U(0) = 0. Figure 3.2 depicts the shape of this utility function for various values of detection probability $p_d$.

Figure 3.2: Utility function characteristics for a range of values of detection probability $p_d$. (Plot of U(n) against the number of active sensors n, for $p_d$ = 0.1, 0.5 and 0.9.)

The long-term performance is represented by the time-average utility of the system. Let $\mathcal{A}$ denote the physical space of interest, and A denote a generic area element in $\mathcal{A}$. Let $n_P(A, t)$ denote the number of active sensors that cover area element A at time t, when activation policy P is used. The time-average utility under policy P is given by

\[ \lim_{t \to \infty} \frac{1}{t} \int_0^t \int_{\mathcal{A}} U(n_P(A, t)) \, dA \, dt . \tag{3.1} \]

In a Euclidean coordinate system, $dA = dx\,dy$ and $n_P(A, t) = n_P(x, y, t)$ in the above expression. The decision problem is that of finding the activation policy P such that the objective function in (3.1) is maximized. As mentioned before, our decision problem is that of determining how many sensors to activate at any time from the set of ready sensors. Note that if we activate more sensors, we gain utility in the short time-scale. However, if the number of active sensors is already large, since the utility function exhibits diminishing returns, we may want to keep some of the ready sensors “in store” for future use.

3.2.2 Challenges and Basic Approach

The stochastic nature of the discharge and recharge times of sensors makes the determination of optimal activation policies very hard in a general setting. Further, spatio-temporal correlations imply that at any point in time, the optimal activation policy for a sensor might depend on the history of the states of all the sensors in the network. Although in specific cases the problem of finding optimal policies may be formulated as a semi-Markov decision problem, determining optimal policies can be computationally prohibitive. Since sensors are energy constrained, we seek policies that can be implemented in a distributed manner with minimal information and computational overhead. Therefore, we focus on simple threshold policies (defined precisely in Section 3.3.2) and examine their performance. To simplify the analysis and obtain fundamental performance insights, we examine the performance of threshold policies for a system of sensors in which all the sensors are able to cover the region of interest, i.e., the region lies inside the sensing (or coverage) radius of all the sensors in the system. In this case, the objective function in (3.1) reduces to a single integral over the time domain.
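The diminishing-returns trade-off behind the activation decision can be made concrete with the detection utility $U(n) = 1 - (1 - p_d)^n$ from Section 3.2.1. The following is a minimal sketch; the function names `detection_utility` and `marginal_gain` are introduced here for illustration only.

```python
def detection_utility(n, p_d):
    """Probability that at least one of n active sensors detects an event,
    when each detects independently with probability p_d."""
    if n < 0 or not 0.0 <= p_d <= 1.0:
        raise ValueError("need n >= 0 and 0 <= p_d <= 1")
    return 1.0 - (1.0 - p_d) ** n


def marginal_gain(n, p_d):
    """Extra utility from activating one more sensor when n are already active."""
    return detection_utility(n + 1, p_d) - detection_utility(n, p_d)


if __name__ == "__main__":
    # The gain from each additional sensor shrinks as n grows.
    for n in range(5):
        print(n, round(marginal_gain(n, p_d=0.5), 4))
```

Since the marginal gain equals $p_d (1 - p_d)^n$, it decays geometrically in n, which is why holding some ready sensors in reserve can be preferable to activating all of them at once.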
We consider two extreme correlation models of the discharge and recharge times of the different sensors: one in which these times are highly correlated, and the other in which these times are independent of one another. Assuming that the discharge/recharge times are exponential, we formulate the problem as a continuous-time Markov decision problem and provide a procedure for determining the optimal policy. Since the associated computational complexity is significant, we focus on the class of threshold decision policies. Threshold policies yield closed-form expressions, and the optimal threshold policy can be computed efficiently. Under Markovian assumptions, we derive tight bounds on the performance of threshold policies for two different lifetime correlation models of the sensor nodes. In particular, we show that the time-average utility of the appropriately chosen threshold policy is at least 3/4 of the best possible performance, for both correlation models. Moreover, we show that correlation in the discharge and recharge times of the sensors degrades performance at all threshold values. Through numerical studies, we also show that the performance of the optimal threshold policy is very close to the best achievable performance. These performance bounds motivate our study of the performance of threshold-based policies in a very general network setting, where the sensors cover the region of interest only partially, and the coverage areas of different sensors could have partial, complete or no overlap with each other. The threshold policies derived for this scenario can be implemented by sensors in a distributed manner, based only on information about the local network state. Through extensive simulation studies, we show that the performance of our policy is very close to that of a globally optimal policy. Therefore, threshold policies allow us to obtain near-optimal solutions to our complex decision problem in an efficient manner.
The distributed implementation and performance evaluation of these policies is discussed in Chapter 4.

3.3 System Model and Assumptions

We consider a system of N sensors, all of which cover the region of interest. We assume that the discharge and recharge times of each sensor are random variables.

Assumption 1 The discharge time and recharge time of any sensor are exponentially distributed with means $1/\mu_1$ and $1/\mu_2$, respectively. Moreover, $\mu_1 \geq \mu_2$.

Assumption 2 The energy level of a sensor does not change in the ‘ready’ state.

The exponential model of the discharge and recharge times allows better analytical tractability. Moreover, the optimal policies under this assumption depend only on the number of sensors in the different states in the system, and not on their exact energy levels. Without Markovian properties, the system can be very difficult to analyze, and implementing the optimal decision policies (if they can be obtained) would require more detailed system information and additional overhead. The assumption $\mu_1 \geq \mu_2$ is based on the observation that the rate of recharge in batteries is typically slower than the rate of discharge. Assumption 2 states that a sensor remains in the fully charged state as long as it remains in the ‘ready’ state. In reality, we would expect that energy will be drained even in the ‘ready’ state, but probably at a fairly steady rate. Since the energy discharge rate in the ‘ready’ state can be expected to be much slower than the discharge rate in the ‘active’ state, it is ignored in our analysis. The system objective is to find a node activation policy such that the time-average utility in the system is maximized. Let $n_P(t)$ denote the number of sensors in the ‘active’ state at time t under policy P. Since all sensors are able to cover the region of interest, the optimization problem can be posed as that of finding a policy P that maximizes $\bar{U}(P)$, where $\bar{U}(P)$ is defined as

\[ \bar{U}(P) = \lim_{t \to \infty} \frac{1}{t} \int_0^t U(n_P(t)) \, dt . \tag{3.2} \]

We assume that switching decisions can be taken at any instant of time. Clearly, these decisions would need to be taken only when the state of the overall system changes, i.e., when the number of sensors in the ‘active’, ‘passive’ or ‘ready’ states changes. In other words, these decisions need to be taken when some sensor makes a transition from the ‘active’ to the ‘passive’ state, or some sensor makes a transition from the ‘passive’ to the ‘ready’ state.

3.3.1 Sensor Lifetime Models

We consider two different correlation models of the discharge and recharge times of the different sensors:

• Independent Lifetime (IL) Model: In this model, the discharge and recharge times of all sensors are independent.
• Correlated Lifetime (CL) Model: In this model, the discharge times of all sensors entering the ‘active’ state at the same time are the same. Similarly, the recharge times of all sensors entering the ‘passive’ state at the same time are the same. The discharge (recharge) times of sensors entering the ‘active’ (‘passive’) state at different times are independent.

The two correlation models can be practically motivated as follows. First, consider a scenario where data transmission (on the detection of interesting events) is the primary mode of energy expenditure, as is often the case in practice. Moreover, assume that the detection of an event and/or the subsequent data transmission is a random variable. This could happen, for instance, if a certain failure probability is associated with the detection of an event at each sensor, or if an event is reported only by one or a subset of the sensors (randomly chosen) detecting the event (to avoid redundant event reporting). In such scenarios, the system is better modeled with independent discharge times. Let us now consider a scenario where all active sensors report data on a regular basis, or the data reporting is so infrequent that most of the energy expenditure occurs in sensing and processing.
In these cases, the active sensors will all get discharged at the same rate, and therefore, the system is better modeled with correlated discharge times. The cases of independent and correlated recharge times can be motivated as follows. If the sensors are located close to one another, the recharge processes at different sensors are expected to be highly correlated. On the other hand, if the sensors are not closely located, then the system may be better modeled with independent recharge times. As we discuss later in more detail, the performance of node activation policies depends considerably on the degree of correlation between the discharge and recharge times of sensors. Note that these two models, IL and CL, represent two extreme forms of correlation, and real-life situations can be expected to fall in between these two extremes. We will argue that our solutions perform well with respect to both of these extreme forms of correlation. Therefore, our solutions are expected to perform well in intermediate correlation scenarios as well. Note that the optimal time-average utility (computed over all possible activation policies) could be different for the two correlation models. We denote the optimal time-average utility for the IL and CL models as $\bar{U}_I^*$ and $\bar{U}_C^*$, respectively.

3.3.2 Threshold Activation Policies

We note that the set of all possible activation policies can be very large, and the structure of these policies can be very complex. Therefore, determining the optimal activation policy for the IL and CL models can be very difficult, and evaluating the optimal time-average utilities, $\bar{U}_I^*$ and $\bar{U}_C^*$, can be computationally intensive. Therefore, we focus primarily on threshold activation policies.

Algorithm 1: Threshold Activation (parameter m)
  for each sensor i in Ready state do
    Calculate the number of active sensors (n) in the system
    if n < m then
      sensor i activates itself
    else
      sensor i deactivates itself till the next decision instant
    end if
  end for
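The activation rule of Algorithm 1 can be sketched in Python as follows. This is a minimal, centralized view of one decision instant, assuming the numbers of active and ready sensors are known; Algorithm 1 itself is stated per sensor, and the function name `sensors_to_activate` is a hypothetical helper introduced here.

```python
def sensors_to_activate(num_active, num_ready, m):
    """Number of ready sensors that a threshold-m policy activates at a
    decision instant, so that the active count moves as close to m as
    possible without exceeding it."""
    if num_active < 0 or num_ready < 0 or m < 0:
        raise ValueError("counts and threshold must be non-negative")
    # Room left below the threshold; zero if the threshold is already met.
    headroom = max(0, m - num_active)
    # Cannot activate more sensors than are currently ready.
    return min(num_ready, headroom)


if __name__ == "__main__":
    # 2 active, 5 ready, threshold 4: two sensors are activated, three stay ready.
    print(sensors_to_activate(num_active=2, num_ready=5, m=4))
```

In a distributed setting each ready sensor would apply the test n < m on its own; the batch form above gives the same outcome only if the ready sensors decide one after another against an up-to-date active count, which is an assumption of this sketch.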
A threshold activation policy with parameter m is characterized as follows: a ready sensor s is activated if the number of active sensors does not exceed m after s is activated; otherwise, s is kept in the ‘ready’ state. The threshold activation algorithm is described in Algorithm 1. In other words, a threshold policy with parameter m tries to maintain the number of active sensors as close to m as possible. Note that with such a policy, the number of active sensors can never exceed m, and there cannot be any ready sensors in the system when the number of active sensors is less than m. Note that activation decisions should be taken only when the state of the system changes, i.e., when a sensor moves from the active to the passive state, or when a sensor moves from the passive to the ready state. In practice, these decision instants could be periodic. However, to simplify the analysis, we assume that all the time instants at which the system state changes are decision instants. The time-average utilities achieved by the threshold activation policy with parameter m are denoted by $\bar{U}_{T,I}(m)$ and $\bar{U}_{T,C}(m)$ for the IL and CL models, respectively.

3.4 Analysis

In this section, we compare the performance of threshold activation policies with that of the optimal activation policy. Let $\rho = \mu_1/\mu_2 \geq 1$. For simplicity of exposition, we assume that ρ is an integer and that N is divisible by (ρ + 1), although our results can be generalized to the cases where these assumptions do not hold.

3.4.1 Upper Bound on $\bar{U}_I^*$ and $\bar{U}_C^*$

Since the optimal time-average utility achievable over all policies is difficult to compute, we obtain an upper bound on it, and compare the performance of threshold policies with this bound.

Theorem 1 The optimal time-average utilities for the two correlation models, $\bar{U}_I^*$ and $\bar{U}_C^*$, are both upper-bounded by $U\left(\frac{N}{1+\rho}\right)$, i.e.,

\[ \bar{U}_I^* \leq U\left(\frac{N}{1+\rho}\right) \quad \text{and} \quad \bar{U}_C^* \leq U\left(\frac{N}{1+\rho}\right). \]

Proof: Let f and p be measurable functions finite a.e. on a set R. Suppose fp and p are integrable on R, $p \geq 0$, and $\int_R p > 0$. If φ is convex in an interval containing the range of f, then Jensen’s inequality [76] states

\[ \phi\left(\frac{\int_R f p}{\int_R p}\right) \leq \frac{\int_R \phi(f)\, p}{\int_R p} . \]

Let n(t) denote the number of sensors in the active state at time t. Since U(·) is concave, the inequality is reversed; substituting φ = U(·), f = n(t) and p = 1 in the above, Jensen’s inequality implies

\[ U\left(\frac{\int_0^T n(t)\,dt}{T}\right) \geq \frac{\int_0^T U(n(t))\,dt}{T} . \]

Since U(·) is continuous, we have

\[ \lim_{T \to \infty} U\left(\frac{\int_0^T n(t)\,dt}{T}\right) \geq \lim_{T \to \infty} \frac{\int_0^T U(n(t))\,dt}{T} . \tag{3.3} \]

Define $\psi_i(t)$ such that $\psi_i(t) = 1$ if sensor i is in the active state at time t, and $\psi_i(t) = 0$ otherwise. Then, continuity of U(·) also implies

\[ \lim_{T \to \infty} U\left(\frac{\int_0^T n(t)\,dt}{T}\right) = U\left(\lim_{T \to \infty} \frac{\int_0^T n(t)\,dt}{T}\right) = U\left(\lim_{T \to \infty} \frac{\int_0^T \sum_{i=1}^{N} \psi_i(t)\,dt}{T}\right) . \tag{3.4} \]

Since $\psi_i(t)$ is non-negative and bounded,

\[ \lim_{T \to \infty} U\left(\frac{\int_0^T n(t)\,dt}{T}\right) = U\left(\lim_{T \to \infty} \sum_{i=1}^{N} \frac{\int_0^T \psi_i(t)\,dt}{T}\right) = U\left(\sum_{i=1}^{N} \lim_{T \to \infty} \frac{\int_0^T \psi_i(t)\,dt}{T}\right) . \tag{3.5} \]

Further, since all sensors are identical, for any k

\[ U\left(\lim_{T \to \infty} \frac{\int_0^T n(t)\,dt}{T}\right) = U\left(N \lim_{T \to \infty} \frac{\int_0^T \psi_k(t)\,dt}{T}\right) . \tag{3.6} \]

Since the times each sensor spends in the active and passive states are independent, with means $1/\mu_1$ and $1/\mu_2$ respectively, we have

\[ \frac{1}{1+\rho} \geq \lim_{T \to \infty} \frac{\int_0^T \psi_k(t)\,dt}{T} , \tag{3.7} \]

where the equality holds if the sensor spends zero time in the ready state. From the non-decreasing nature of U(·), and using (3.7) and (3.6), we have

\[ U\left(\frac{N}{1+\rho}\right) \geq U\left(\lim_{T \to \infty} \frac{\int_0^T n(t)\,dt}{T}\right) . \tag{3.8} \]

Combining (3.8) and (3.3), we obtain

\[ U\left(\frac{N}{1+\rho}\right) \geq U\left(\lim_{T \to \infty} \frac{\int_0^T n(t)\,dt}{T}\right) \geq \lim_{T \to \infty} \frac{\int_0^T U(n(t))\,dt}{T} . \tag{3.9} \]

This implies that the time-average utility under any policy cannot be greater than $U\left(\frac{N}{1+\rho}\right)$. In particular, the optimal time-average utilities for the two correlation models, $\bar{U}_I^*$ and $\bar{U}_C^*$, are both upper-bounded by $U\left(\frac{N}{1+\rho}\right)$, thus proving Theorem 1.

The bound in Theorem 1 is independent of the distribution of the recharge and discharge times of the sensors, as long as the mean discharge and recharge times remain constant.
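The bound of Theorem 1 can be checked numerically for one simple policy. The sketch below assumes the IL model with exponential times and the policy that activates every sensor as soon as it is recharged, so that each sensor is independently active with stationary probability 1/(1+ρ) and the number of active sensors is Binomial(N, 1/(1+ρ)). The detection utility and the function names are illustrative choices, not part of the theorem.

```python
from math import comb


def detection_utility(n, p_d=0.5):
    """Illustrative concave utility U(n) = 1 - (1 - p_d)^n with U(0) = 0."""
    return 1.0 - (1.0 - p_d) ** n


def always_activate_utility(N, rho, utility=detection_utility):
    """Time-average utility E[U(n)] with n ~ Binomial(N, 1/(1+rho)),
    i.e. every recharged sensor is activated immediately (IL model)."""
    p = 1.0 / (1.0 + rho)
    return sum(
        comb(N, i) * p**i * (1.0 - p) ** (N - i) * utility(i)
        for i in range(N + 1)
    )


def upper_bound(N, rho, utility=detection_utility):
    """The bound U(N / (1 + rho)) of Theorem 1."""
    return utility(N / (1.0 + rho))


if __name__ == "__main__":
    N, rho = 12, 2
    print("always-activate:", always_activate_utility(N, rho))
    print("upper bound:    ", upper_bound(N, rho))
```

The gap between the two printed values is the Jensen gap E[U(n)] ≤ U(E[n]); it comes from the randomness in the number of active sensors.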
Further, the bound is achieved exactly when all the sensors have deterministic discharge and recharge times of lengths 1/µ1 and 1/µ2 , respectively. 26 With random discharge and recharge times, the bound may not be tight; however, as we show below, it is a fairly good bound for the case when the recharge and discharge times are exponentially distributed. Now we derive worst-case bounds on the performance of threshold policies with respect to the optimal policy for the two correlation models. 3.4.2 Threshold Activation Policies for the IL Model Consider a threshold activation policy with parameter m ∈ {1, 2, 3, ..., N}, ∗ where N is the total number of sensors in the system. Then ŪT,I , the optimal ∗ threshold-based time-average utility for the IL model, is defined as ŪT,I = maxN m=1 ŪT,I (m). Next we state a result for the threshold policy with parameter N. Note that with a threshold of N, once a sensor is completely recharged, it is immediately activated. In other words, no sensor is ever kept waiting in the ‘ready’ state. Theorem 2 The time-average utility at threshold N for the IL model, ŪT,I (N), is N lower-bounded by 21 U( 1+ρ ), i.e, 1 N ŪT,I (N) ≥ U( ). 2 1+ρ Proof: Using steady state Markov chain analysis (see [45] for details), the time-average utility of the system, ŪT,I (m), can be computed as ŪT,I (m) = PN i=1 U(i)α(i, m) , P N α(i, m) i=0 (3.10) where α(i, m), i = 1, 2, ..., N, are defined as α(i, m) = N ρ−i i i!ρ−i N i m!mi−m if i ≤ m, otherwise. (3.11) Using the above expressions, time-average utility obtained for a threshold of m = N 27 is ŪT,I (N) = N X U(i) i=0 We define w = ŪT,I (N) = N X i=0 N ρ+1 and show that N i (1/ρi ) . (1 + 1/ρ)N ŪT,I (N ) U (w) ≥ 21 . We have w−1 N X X U(i) Ni U(i) Ni U(i) Ni = + . ρi (1 + 1/ρ)N ρi (1 + 1/ρ)N i=w ρi (1 + 1/ρ)N i=1 (3.12) From the concavity and non-decreasing nature of U(·) we have, U(k) ≥ (k/w)U(w) for k < w and U(k) ≥ U(w) for k ≥ w. 
Hence

\[ \frac{\bar{U}_{T,I}(N)}{U(w)} \ge \sum_{i=1}^{w-1} \frac{(i/w)\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} + \sum_{i=w}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} \]
\[ = \sum_{i=1}^{w-1} \frac{(1+\rho)\binom{N-1}{i-1}(1/\rho^i)}{(1+1/\rho)^N} + \sum_{i=w}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} \]
\[ = \frac{1+\rho}{\rho} \sum_{i=0}^{w-2} \frac{\binom{N-1}{i}(1/\rho^i)}{(1+1/\rho)^N} + \sum_{i=w}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N}. \]

The right hand side is further simplified by noting that for $0 \le i \le w-2$,

\[ \frac{1+\rho}{\rho} \binom{N-1}{i} \frac{1}{\rho^i} \ge \binom{N-1}{i} \frac{1}{\rho^i} \ge \frac{1}{2} \binom{N}{i} \frac{1}{\rho^i}, \tag{3.13} \]

and

\[ \binom{N}{w} \frac{1}{\rho^w} \ge \binom{N}{w-1} \frac{1}{\rho^{w-1}}. \tag{3.14} \]

We get

\[ \frac{\bar{U}_{T,I}(N)}{U(w)} \ge \sum_{i=0}^{w-2} \frac{(1/2)\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} + \sum_{i=w}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} \]
\[ \ge \frac{1}{2} \sum_{i=0}^{w-2} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} + \frac{1}{2} \sum_{i=w}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} + \frac{1}{2}\, \frac{\binom{N}{w}(1/\rho^w)}{(1+1/\rho)^N} + \frac{1}{2} \sum_{i=w+1}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} \]
\[ \ge \frac{1}{2} \sum_{i=0}^{w-2} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} + \frac{1}{2} \sum_{i=w}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} + \frac{1}{2}\, \frac{\binom{N}{w-1}(1/\rho^{w-1})}{(1+1/\rho)^N} + \frac{1}{2} \sum_{i=w+1}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} \]
\[ \ge \frac{1}{2} \sum_{i=0}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} + \frac{1}{2} \sum_{i=w+1}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} \]
\[ \ge \frac{1}{2} + \frac{1}{2} \sum_{i=w+1}^{N} \frac{\binom{N}{i}(1/\rho^i)}{(1+1/\rho)^N} \ge \frac{1}{2}. \tag{3.15} \]

The result follows from (3.15) and Theorem 1.

Theorem 2 implies that with a threshold of $N$, the performance of the system will be at least 50% of the optimal performance over all policies. Our numerical studies show that the time-average utility of the system operating under the IL model with a threshold of $N$ is usually quite close to the optimal. However, the optimum threshold could, in general, be much less than $N$. The best threshold policy can be found by finding the maximum of $\bar{U}_{T,I}(m)$ over all $m \in \{1, 2, 3, \ldots, N\}$, using the expressions (3.10)-(3.11). Theorem 2, in conjunction with Theorem 1, implies that the optimal threshold-based time-average utility, $\bar{U}^*_{T,I}$, satisfies $\bar{U}^*_{T,I} \ge \frac{1}{2} \bar{U}^*_I$. It is possible to obtain a stronger bound on the performance of the optimal threshold policy for the IL model. The derivation of this bound uses results from the analysis of the CL model, which we discuss next.

3.4.3 Threshold Activation Policies for the CL Model

Consider a threshold activation policy with parameter m.
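We assume that $N$ is a multiple of $m$. Initially, all $N$ sensors are fully charged, $m$ of these are activated, and the remaining $N - m$ remain in the 'ready' state. It is easy to see that the sensors will become grouped into $c = \frac{N}{m}$ batches, each of size $m$, and always move through the different states in these batches. From our definition of a threshold activation policy, it follows that at most one batch can remain active at any time.

Let $S_N$ denote the set of all factors of $N$, i.e., all positive integers which divide $N$. Then $\bar{U}^*_{T,C}$, the optimal threshold-based time-average utility for the CL model, is defined as $\bar{U}^*_{T,C} = \max_{m \in S_N} \bar{U}_{T,C}(m)$.

Next we state an important bound on $\bar{U}^*_{T,C}$.

Theorem 3 The optimal threshold-based time-average utility for the CL model, $\bar{U}^*_{T,C}$, is lower-bounded by $\frac{3}{4} U(\frac{N}{1+\rho})$, i.e.,

\[ \bar{U}^*_{T,C} \ge \frac{3}{4}\, U\left(\frac{N}{1+\rho}\right). \]

Proof: Using steady-state Markov chain analysis (for details, refer to [45]), the time-average utility of the system, $\bar{U}_{T,C}(m)$, can be computed as

\[ \bar{U}_{T,C}(m) = U(m) \left( 1 - \frac{\rho^c / c!}{\sum_{i=0}^{c} \rho^i / i!} \right), \tag{3.16} \]

where $c = N/m$. Recall that $\rho$ is an integer, and $N$ is divisible by $(1+\rho)$.

To prove the lower bound on $\bar{U}^*_{T,C}$, it is sufficient to show that there exists an $\hat{m}$ such that $\bar{U}_{T,C}(\hat{m}) \ge \frac{3}{4} U\left(\frac{N}{1+\rho}\right)$. In particular, we show that the result holds for $\hat{m} = \frac{N}{1+\rho}$, by considering two cases, namely $\rho = 1$ and $\rho \ge 2$.

Case 1 ($\rho = 1$): Since $\rho = 1$, $\hat{m} = \frac{N}{1+\rho} = \frac{N}{2}$ and $c = \frac{N}{\hat{m}} = 2$. Hence,

\[ \bar{U}_{T,C}(\hat{m}) = \left( 1 - \frac{\rho^2 / 2!}{\sum_{i=0}^{2} \rho^i / i!} \right) U\left(\frac{N}{1+\rho}\right) = \frac{1+\rho}{1+\rho+\frac{\rho^2}{2}}\, U\left(\frac{N}{1+\rho}\right) = 0.8\, U\left(\frac{N}{1+\rho}\right) \ge \frac{3}{4}\, U\left(\frac{N}{1+\rho}\right). \]

Case 2 ($\rho \ge 2$): Since $\rho \ge 2$, $\hat{m} = \frac{N}{1+\rho} \le \frac{N}{3}$ and $c = \frac{N}{\hat{m}} \ge 3$. Hence,

\[ \bar{U}_{T,C}(\hat{m}) = \left( 1 - \frac{\rho^c / c!}{\sum_{i=0}^{c} \rho^i / i!} \right) U\left(\frac{N}{1+\rho}\right) \ge \left( 1 - \frac{\rho^c / c!}{\sum_{i=c-3}^{c} \rho^i / i!} \right) U\left(\frac{N}{1+\rho}\right). \tag{3.17} \]

Since $c = (1+\rho)$, we have

\[ \frac{\rho^c / c!}{\sum_{i=c-3}^{c} \rho^i / i!} = \frac{1}{1 + \frac{c}{\rho} + \frac{c(c-1)}{\rho^2} + \frac{c(c-1)(c-2)}{\rho^3}} = \frac{1}{1 + \frac{1+\rho}{\rho} + \frac{(1+\rho)\rho}{\rho^2} + \frac{(\rho+1)\rho(\rho-1)}{\rho^3}} = \frac{1}{4 + \frac{2}{\rho} - \frac{1}{\rho^2}} \le \frac{1}{4}. \tag{3.18} \]

From (3.17) and (3.18), we obtain

\[ \bar{U}_{T,C}(\hat{m}) \ge \left( 1 - \frac{1}{4} \right) U\left(\frac{N}{1+\rho}\right) = \frac{3}{4}\, U\left(\frac{N}{1+\rho}\right). \tag{3.19} \]

Theorem 3, together with Theorem 1, implies $\bar{U}^*_{T,C} \ge \frac{3}{4} \bar{U}^*_C$.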
Therefore, for the CL model, the performance of the best threshold policy is at least $\frac{3}{4}$ of the optimal performance over all policies. The best threshold policy can be found by finding the maximum of $\bar{U}_{T,C}(m)$ over all $m \in S_N$, using the expression in (3.16). As we describe later, our numerical results show that this maximum is typically achieved at some intermediate value of $m$. From the proof of Theorem 3, it can also be shown that a threshold of $\frac{N}{1+\rho}$ achieves the lower bound of $\frac{3}{4}$.

3.4.4 Comparison of IL and CL Models

The following result states that for every threshold $m$, the performance for the IL model is at least as good as that for the CL model.

Theorem 4 For any $m \in S_N$, the time-average utility for the IL model, $\bar{U}_{T,I}(m)$, can be no less than the time-average utility for the CL model, $\bar{U}_{T,C}(m)$, i.e.,

\[ \bar{U}_{T,I}(m) \ge \bar{U}_{T,C}(m). \]

Proof: Figure 3.3 provides a queuing network representation of the IL model operating with a threshold of $m$. It is easily seen that the Markov chain for this queuing network is identical to that of the IL model described earlier. Figure 3.4 provides a queuing network representation of the CL model operating with a threshold of $m$. Again, it can be seen that the Markov chain for this queuing network is identical to that of the CL model described earlier.

Figure 3.3: Queuing network representation for the IL model

Figure 3.4: Queuing network representation for the CL model

To show that $\bar{U}_{T,I}(m) \ge \bar{U}_{T,C}(m)$ we construct the multi-class queuing network model of an intermediate system as shown in Figure 3.5. There are $m$ classes of customers with $c = N/m$ customers in each class. Each class of customer visits one of the $m$ exponential servers in station 1 (active sensors) and one of the $N/m$ exponential servers in station 2 (passive sensors). Note that the behavior of each class of customers in the network shown in Figure 3.5 is identical to that of the batches in the network representation of the CL model.
Comparing the networks in Figures 3.4 and 3.5, we note that the steady state probability of there being $i$ customers at a particular server at station 1 in Figure 3.5 is equal to the steady state probability, $\pi_{i,C}(m)$, of there being $i$ customers at station 1 in Figure 3.4. Therefore, the throughput of each class of customers in the network in Figure 3.5 is equal to the throughput of the network in Figure 3.4 and is given by $\chi_C(m) = (1 - \pi_{0,C}(m))\mu_1$.

Figure 3.5: Queuing network model of intermediate system

Next, we consider the network representation of the IL model (Figure 3.3). If $\pi_{i,I}(m)$ denotes the steady state probability of there being $i$ customers at station 1 in Figure 3.3, then the throughput of the network is given by

\[ \chi_I(m) = \sum_{i=1}^{m-1} i\,\pi_{i,I}(m)\,\mu_1 + \sum_{i=m}^{N} m\,\pi_{i,I}(m)\,\mu_1. \]

Now we compare the networks in Figure 3.3 and Figure 3.5 and note that both networks have $N$ customers in total and the same number of servers at stations 1 and 2. However, in Figure 3.5 the servers in stations 1 and 2 are dedicated to each class of customer, while the servers in Figure 3.3 are pooled. Then from the results presented in [23] and [66] on the impact of pooled servers on network throughput, it follows that the throughput of the network in Figure 3.3 is at least equal to (if not greater than) the overall throughput (summed over all classes) of the network in Figure 3.5, i.e.,

\[ \chi_I(m) \ge m\,\chi_C(m). \tag{3.20} \]

We use this result to show that $\bar{U}_{T,I}(m) \ge \bar{U}_{T,C}(m)$. Note that $\bar{U}_{T,I}(m) = \sum_{i=1}^{m-1} \pi_{i,I}(m)U(i) + \sum_{i=m}^{N} \pi_{i,I}(m)U(m)$ and $\bar{U}_{T,C}(m) = (1 - \pi_{0,C}(m))U(m)$. From the concavity of $U(\cdot)$ we have $U(i) \ge (i/m)\,U(m)$ for $i \le m$. Hence

\[ \bar{U}_{T,I}(m) = \sum_{i=1}^{m-1} \pi_{i,I}(m)U(i) + \sum_{i=m}^{N} \pi_{i,I}(m)U(m) \]
\[ \ge \sum_{i=1}^{m-1} (i/m)\,\pi_{i,I}(m)U(m) + \sum_{i=m}^{N} \pi_{i,I}(m)U(m) \]
\[ \ge \frac{U(m)\,\chi_I(m)}{m\mu_1}. \tag{3.21} \]

Similarly,

\[ \bar{U}_{T,C}(m) = (1 - \pi_{0,C}(m))U(m) = \frac{U(m)\,\chi_C(m)}{\mu_1}. \tag{3.22} \]

From (3.20), (3.21) and (3.22), it follows that $\bar{U}_{T,I}(m) \ge \bar{U}_{T,C}(m)$.
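As an informal numerical check of Theorem 4 (not part of the original analysis), the two closed-form expressions can be evaluated side by side. The Python sketch below assumes the detection utility $U(n) = 1 - (1 - p_d)^n$, takes the IL stationary weights from (3.11) and the CL expression from (3.16), and evaluates the IL utility in state $i$ as $U(\min(i, m))$, following the expression used in the proof above. The function names are hypothetical and chosen only for illustration.

```python
from math import comb, factorial

def utility(n, p_d=0.1):
    # Assumed detection utility U(n) = 1 - (1 - p_d)^n.
    return 1.0 - (1.0 - p_d) ** n

def il_utility(m, N, rho, p_d=0.1):
    # Unnormalised stationary weights alpha(i, m) from (3.11).
    def alpha(i):
        base = comb(N, i) * rho ** (-i)
        if i <= m:
            return base
        return base * factorial(i) / (factorial(m) * m ** (i - m))

    weights = [alpha(i) for i in range(N + 1)]
    total = sum(weights)
    # At most m sensors are active, so state i yields U(min(i, m)).
    return sum(w * utility(min(i, m), p_d) for i, w in enumerate(weights)) / total

def cl_utility(m, N, rho, p_d=0.1):
    # Expression (3.16) with c = N / m batches (m must divide N).
    c = N // m
    terms = [rho ** i / factorial(i) for i in range(c + 1)]
    return utility(m, p_d) * (1.0 - terms[-1] / sum(terms))

if __name__ == "__main__":
    N, rho = 16, 7
    for m in (1, 2, 4, 8, 16):
        print(m, round(il_utility(m, N, rho), 4), round(cl_utility(m, N, rho), 4))
```

For $m = 1$ the two models coincide, so the comparison should allow for a small floating-point tolerance.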
Theorem 4 implies that the presence of correlation amongst the discharge and recharge times of sensors degrades system performance. Theorems 3 and 4 allow us to improve our earlier bound on the performance of the optimal threshold policy for the IL model.

Corollary 5 The optimal threshold-based time-average utility for the IL model, $\bar{U}^*_{T,I}$, is lower-bounded by $\frac{3}{4} U(\frac{N}{1+\rho})$, i.e.,

\[ \bar{U}^*_{T,I} \ge \frac{3}{4}\, U\left(\frac{N}{1+\rho}\right). \]

Corollary 5, in conjunction with Theorem 1, implies $\bar{U}^*_{T,I} \ge \frac{3}{4} \bar{U}^*_I$. Therefore, for both the IL and CL models, the performance of the best threshold policy is at least $\frac{3}{4}$ of the best achievable performance.

3.5 Numerical Results

In this section, we report results from numerical experiments performed on the threshold activation policies for the IL and CL models under different parameter settings. For the utility function $U(n) = 1 - (1 - p_d)^n$, we conduct numerical experiments for different values of $p_d$ (= 0.1, 0.9), $N$ (= 16, 32, 48), and $\rho$ (= 3, 7, 15). To obtain the different values of $\rho$, we set $\mu_2 = 1$ and vary $\mu_1$. For each parameter setting, we compare the time-average utility of the system for different values of the threshold.

Figure 3.6: Time-average utility for IL and CL models ($\mu_1 = 7$, $\mu_2 = 1$, $N = 16$, $p_d = 0.1$). At the threshold of $\frac{N}{1+\rho} = 2$, both the models achieve more than 75% performance. (Plot of time-average utility against threshold $m$; curves: Independent Lifetime, Correlated Lifetime, Upper Bound.)

3.5.1 Performance under IL and CL models

Figures 3.6 and 3.7 depict typical plots that describe the performance of threshold policies in the presence of low and high probability of detection ($p_d$) respectively. Note that the figures show the time-average utilities $\bar{U}_{T,I}(m)$ and $\bar{U}_{T,C}(m)$ along with $U(\frac{N}{1+\rho})$, the upper bound on the maximum achievable time-average utility. Figures 3.6 and 3.7 indicate that for both the CL and IL models, the time-average utility is maximized at an intermediate value of the threshold. Further, the optimal threshold may be distinct from $\frac{N}{1+\rho}$. For the CL model, when operating with a threshold greater than the optimal, the time-average utility decreases very rapidly with the threshold value. However, for the IL model the decrease is gradual and in many cases marginal. The rapid decrease in performance of the CL model for thresholds other than the optimal emphasizes the need to model and understand the impact of correlation on system performance.

Figure 3.7: Time-average utility for IL and CL models ($\mu_1 = 7$, $\mu_2 = 1$, $N = 16$, $p_d = 0.9$). At the threshold of $\frac{N}{1+\rho} = 2$, both the models achieve more than 75% performance. (Plot of time-average utility against threshold $m$; curves: Independent Lifetime, Correlated Lifetime, Upper Bound.)

Figure 3.8 depicts the transient sensor system behavior. We note that the time-average utility converges to the steady-state value of 0.167 in around 100 seconds of simulation time. The number of active sensors is plotted at each decision instant, when the threshold of $m = 2$ is applied. The number of active sensors equals zero at around 3.7% of decision instants, equals one at around 11.7% of decision instants and equals two (i.e. the threshold of $m = 2$ is met) at around 84.6% of decision instants.

Tables 3.1 and 3.2 list the ratio of the time-average utility obtained at the optimal threshold ($\bar{U}^*_{T,I}$ or $\bar{U}^*_{T,C}$) to the lower bound of $\frac{3}{4} U(\frac{N}{1+\rho})$. Note that this ratio must lie between 1 and $\frac{4}{3}$. A value close to 1 indicates a tight lower bound, whereas a value close to $\frac{4}{3}$ indicates that performance of the optimal threshold policy is close to the best achievable performance.
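To make the definition of this ratio concrete, the following Python sketch evaluates it for the CL model directly from expression (3.16). It is an illustrative calculation under the utility $U(n) = 1 - (1 - p_d)^n$; the helper names `cl_utility` and `cl_ratio` are hypothetical and do not refer to the code used to generate the tables.

```python
from math import factorial

def utility(n, p_d):
    # Assumed detection utility U(n) = 1 - (1 - p_d)^n.
    return 1.0 - (1.0 - p_d) ** n

def cl_utility(m, N, rho, p_d):
    # Expression (3.16): U(m) times the probability that some batch is active.
    c = N // m
    terms = [rho ** i / factorial(i) for i in range(c + 1)]
    return utility(m, p_d) * (1.0 - terms[-1] / sum(terms))

def cl_ratio(N, rho, p_d):
    # Best threshold over the factors of N, divided by (3/4) U(N / (1 + rho)).
    factors = [m for m in range(1, N + 1) if N % m == 0]
    best = max(cl_utility(m, N, rho, p_d) for m in factors)
    lower_bound = 0.75 * utility(N / (1 + rho), p_d)
    return best / lower_bound

if __name__ == "__main__":
    for rho in (3, 7, 15):
        print(rho, round(cl_ratio(16, rho, 0.1), 2))
```

For $N = 16$, $p_d = 0.1$ the sketch gives ratios of roughly 1.06 for $\rho = 3$ and 1.14 for $\rho = 7$, with the maximum attained at $m = 4$ in both cases.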
Table 3.1 indicates that for low values of the probability of detection $p_d$, the time-average utility obtained by the optimal threshold policy for the CL model is very close to the lower bound. However, performance of the optimal threshold policy for the IL model is fairly close to the maximum achievable value $U(\frac{N}{1+\rho})$.

Figure 3.8: Transient sensor system behavior: time-average utility and number of active sensors over time for the IL model ($\mu_1 = 7$, $\mu_2 = 1$, $N = 16$, $p_d = 0.1$) when a threshold of $m = 2$ is applied. (a) Performance for $m = 2$ (time-average utility against simulation time in seconds; curves: Independent Lifetime, Upper Bound). (b) Active sensors for $m = 2$ (number of active sensors against simulation time in seconds).

Table 3.1: Ratio of optimal threshold-based time-average utility and lower bound, for $p_d = 0.1$.

(a) IL Model
          N=16    N=32    N=48
  ρ=3     1.29    1.30    1.31
  ρ=7     1.28    1.29    1.29
  ρ=15    1.27    1.28    1.28

(b) CL Model
          N=16    N=32    N=48
  ρ=3     1.06    1.06    1.06
  ρ=7     1.14    1.09    1.09
  ρ=15    1.22    1.16    1.17

Table 3.2: Ratio of optimal threshold-based time-average utility and lower bound, for $p_d = 0.9$.

(a) IL Model
          N=16    N=32    N=48
  ρ=3     1.33    1.33    1.33
  ρ=7     1.24    1.33    1.33
  ρ=15    1.14    1.25    1.32

(b) CL Model
          N=16    N=32    N=48
  ρ=3     1.31    1.32    1.33
  ρ=7     1.21    1.32    1.33
  ρ=15    1.14    1.21    1.31

Figure 3.9: Time-average utility plot for various distributions of recharge and discharge times with the same mean ($\mu_1 = 7$, $\mu_2 = 1$) and variance (= mean$^2$/3). (Plot of time-average utility against threshold $m$; curves: Beta, Gamma, Uniform, Upper bound.)

Table 3.2 indicates that for high values of the probability of detection $p_d$, the time-average utility obtained by the optimal threshold policy for both models is very close to the maximum achievable value $U(\frac{N}{1+\rho})$, although the performance for the IL model is slightly better, as expected.
In summary, in most cases, the optimal threshold policies yield performance that is very close to the maximum achievable performance. When this is not the case (for instance, for the CL model and low values of $p_d$), the performance is fairly close to our lower bound of $\frac{3}{4} U(\frac{N}{1+\rho})$. Further, the numerical experiments also indicate that our bounds are fairly robust to the choice of $N$ and $\rho$.

3.5.2 Distribution Independence

Simulations were carried out for the IL model with recharge and discharge times drawn from several different distributions, viz. Beta, Gamma and Uniform. The performance of threshold activation policies was measured by computing the time-average utility. It is observed that the performance of a threshold activation policy is independent of the distribution of the recharge and discharge times. This can be inferred by observing the performance of the threshold activation policies under the various distributions with the same mean and variance, as shown in Figure 3.9.

3.6 Summary

In this chapter, we have considered a system of rechargeable sensors, and addressed the question of how sensors should be activated dynamically with the objective of maximizing a generalized global performance metric. In view of the computational difficulty in obtaining the optimal solution, we have studied the performance of simple threshold activation policies. We have shown analytically that for the case where all sensors cover the region of interest, the best threshold policy achieves a performance of at least $\frac{3}{4}$ of the optimum. Numerical results show that in certain cases, the best threshold activation policy performs quite close to the global optimum. We also observe that the presence of correlation in the discharge and recharge times worsens system performance, particularly at large thresholds.
CHAPTER 4
Distributed Activation Policies under Random Sensor Deployment

In a realistic deployment scenario, sensor nodes may be deployed at random in a substantially large region of interest. Therefore, these nodes will cover the region of interest only partially. In addition, the coverage areas of two sensors may overlap only partially, or may not overlap at all (i.e. be disjoint). In this chapter, we extend the threshold activation policy discussed in Chapter 3 to this general scenario. We refer to the case where all the sensors are able to cover the region of interest as the identical coverage case, and the case where the sensors cover the region of interest only partially, and may have partial or no overlap among each other's coverage areas, as the partial coverage overlap case.

The case of partial coverage overlap is very difficult to model and analyze, even for the special class of threshold policies. Therefore, we try to develop a solution heuristically, based on the insights obtained from the identical coverage case. We show, through extensive simulations, that our solution yields a performance trend similar to that observed in the previous case of identical coverage among the sensors. In particular, we observe that the performance achieved still satisfies the three-fourths bound with respect to the upper bound on the optimum over all policies.

Section 4.1 describes the distributed threshold-based node activation algorithm developed for the case of partial coverage overlap. In Section 4.2, we derive an appropriate upper bound on the maximum performance achievable by any activation policy in this distributed setting. Section 4.3 discusses the various sensor lifetime models considered to account for varying degrees of spatial correlation among the sensors, and Section 4.4 presents simulation results obtained using the distributed threshold-based node activation algorithm in this general network of rechargeable sensors.
4.1 Distributed Node Activation Algorithm

To motivate our distributed activation algorithm, let us assume that a sensor $i$ wants to maintain a utility of $U(m_i)$ per unit area per unit time in its coverage area, where $m_i$ is appropriately chosen. In other words, if the coverage area of the sensor is denoted by $A_i$, then the sensor aims to derive a utility of $|A_i|\,U(m_i)$ per unit time. When the sensor is in the 'ready' state, then at any decision instant, the sensor computes the current utility per unit time in its coverage area. If the current utility is less than the targeted utility, then the sensor activates itself; otherwise, the sensor remains in the 'ready' state until the next decision instant.

A sensor can compute the utility derived from its coverage area in the following manner. For a generic area element $A \in A_i$, let $n(A, t)$ denote the number of sensors covering $A$ at time $t$. Then the utility per unit time in the coverage area of sensor $i$ is calculated as

\[ \int_{A_i} U(n(A, t))\, dA. \tag{4.1} \]

Assume that sensor $i$ can communicate with all sensors whose coverage areas overlap with its own coverage area. Then the sensor can periodically poll these neighbors to know their activation state. Assuming that sensor $i$ knows the coverage patterns of those neighbors, it can compute the current utility by evaluating the expression in (4.1). Therefore, the proposed algorithm can be realized in a distributed setting based only upon local information.
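The activation test described above can be sketched in a few lines of Python. The sketch below is only illustrative and rests on assumptions that are not fixed by the text: coverage areas are taken to be discs of a common radius, the utility is $U(n) = 1 - (1 - p_d)^n$, $n(A, t)$ is taken to count the active sensors covering $A$, and the integral in (4.1) is approximated by Monte Carlo sampling over sensor $i$'s disc. All function and parameter names are hypothetical.

```python
import math
import random

def utility(n, p_d=0.1):
    # Assumed detection utility U(n) = 1 - (1 - p_d)^n.
    return 1.0 - (1.0 - p_d) ** n

def current_utility(i, positions, active, radius, samples=2000, seed=0):
    """Monte Carlo estimate of (4.1) over the disc covered by sensor i."""
    rng = random.Random(seed)
    cx, cy = positions[i]
    total = 0.0
    for _ in range(samples):
        # Draw a point uniformly from the disc of the given radius.
        r = radius * math.sqrt(rng.random())
        theta = 2.0 * math.pi * rng.random()
        point = (cx + r * math.cos(theta), cy + r * math.sin(theta))
        # Count active sensors (the polled neighbours) covering this point.
        n = sum(1 for j in active if math.dist(positions[j], point) <= radius)
        total += utility(n)
    return math.pi * radius ** 2 * total / samples

def should_activate(i, positions, active, radius, m):
    # A ready sensor activates if the current utility is below |A_i| U(m).
    target = math.pi * radius ** 2 * utility(m)
    return current_utility(i, positions, active, radius) < target
```

In a deployment, `active` would hold only the neighbors whose state sensor $i$ learns by polling; a grid or exact geometric evaluation could replace the sampling step.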
Algorithm 2: Threshold-based Activation (global threshold parameter $m$)
  for each sensor $i$ in Ready state at decision instant $t$ do
    Calculate the current utility ($u_C(t)$) derived in sensor $i$'s coverage area
    The targeted utility ($u_T(t)$) equals $|A_i|\,U(m)$
    if $u_C(t) < u_T(t)$ then
      sensor $i$ activates itself
    else
      sensor $i$ deactivates itself till the next decision instant
    end if
  end for

Note that the algorithm is motivated by the threshold activation policy discussed in Chapter 3, and in the case of identical coverage, it reduces to a distributed implementation of the threshold policy described in Chapter 3. In practice the decision interval needs to be chosen carefully to ensure that not too much energy is wasted in the 'ready' state by periodic wakeup and polling, while guaranteeing good performance.

The thresholds $m_i$ can be defined globally or locally, and accordingly we have two variants of our activation policy:

• Global threshold policy: In this case, $m_i = m\ \forall i$, where the fixed threshold $m$ is chosen appropriately.

• Local threshold policy: In this case, $m_i$ can be different for each sensor $i$, depending on the local neighborhood of the individual sensors.

The threshold-based distributed algorithm with a global threshold parameter is described in Algorithm 2. The algorithm for a local threshold parameter is described in a similar manner. In the simulations performed later in the chapter, we take the decision instants of a sensor to be all the time instants at which the state of the system in its immediate neighborhood changes.

In Section 4.4, we comment on the appropriate choice of the local and global thresholds needed to yield optimum performance. We can intuitively expect the local threshold policy to perform better, particularly in scenarios where there is a high spatial variance in the density of sensor nodes in the deployment region.
For the local threshold policy, sensors in areas with larger density can have a higher threshold, while sensors in a sparser region can set their threshold to a lower value. However, if the sensors are deployed more or less uniformly, then both these policies are observed to perform very well in simulations, although the local threshold policy performs slightly better.

4.2 Upper Bound on Optimal Time-average Utility

In this section, we derive an upper bound on the optimal time-average utility derived from a sensor network with partial coverage overlap. We assume that the mean discharge and recharge times of the sensors are given by $\frac{1}{\mu_1}$ and $\frac{1}{\mu_2}$ respectively, and $\rho = \frac{\mu_1}{\mu_2} \ge 1$. We do not make any assumption on the distribution of the discharge and recharge times. Let $A$ denote a generic area element in the physical space of interest, and $N(A)$ denote the number of active sensors that cover area element $A$.

Corollary 6 The optimal time-average utility for a general network of sensors is upper-bounded by

\[ \int_{A} U\left(\frac{N(A)}{1+\rho}\right) dA. \tag{4.2} \]

The above result can be proved following the same line of analysis as in the proof of Theorem 1. Since the optimal policy is difficult to formulate and compute in this case, we will compare the performance of our distributed algorithm with respect to this upper bound.

4.3 Sensor Lifetime Models

We consider five different sensor lifetime models for the distributed sensor network scenario. The first two models (independent and correlated lifetime models) are extensions of the IL and CL models considered for the identical coverage case in Chapter 3. The next two models (independent and correlated event-based lifetime models) are event-based. For these models, we assume that discharging and recharging depend on events that occur randomly in the deployment region (in a manner as described below). Finally, for the sake of comparison, we also consider a deterministic lifetime model, where the discharge and recharge times of each sensor are fixed.
• Independent Lifetime Model: The discharge and recharge times of the sensors are exponential i.i.d. with means $1/\mu_1$ and $1/\mu_2$ respectively.

• Correlated Lifetime Model: The discharge and recharge times of the sensors are exponentially distributed with means $1/\mu_1$ and $1/\mu_2$ respectively. However, the discharge (recharge) times of all sensors entering the 'active' ('passive') state at the same time are the same, while the discharge (recharge) times of sensors entering the 'active' ('passive') state at different times are independent of each other.

• Independent Event-based Lifetime Model: Events are assumed to occur randomly in the physical space of interest, and a sensor node gets discharged (by a fixed amount $q$) only when an event occurs within its coverage area. Events are assumed to occur according to a Poisson process, and are uniformly distributed in the area of interest. A sensor node, on activation, is assumed to have a total energy of $Q$ units. Therefore, an active sensor gets fully discharged once $Q/q$ events have occurred within its coverage area. The recharge process is modeled similarly to the discharge process. A passive node gets fully recharged once a certain number of random "recharge events" have occurred in its coverage area. The mean inter-event times for discharging and recharging are chosen so that the mean discharge and recharge times of sensors equal $1/\mu_1$ and $1/\mu_2$ respectively.

• Correlated Event-based Lifetime Model: The network is divided into imaginary blocks of equal sizes. As in the case of the independent event-based lifetime model, events occur according to a Poisson process, and are assumed to be uniformly distributed in the area of interest. However, an event occurring anywhere in the block affects all the sensors located in this block in a similar manner. This introduces spatial correlation between the discharge and recharge times of the sensors located in the same block.
The degree of spatial correlation depends on the sizes of the blocks. In this model too, the mean inter-event times are chosen so that the mean discharge and recharge times of sensors equal $1/\mu_1$ and $1/\mu_2$ respectively.

• Deterministic Lifetime Model: The discharge and recharge times of the sensors are constant, and equal $1/\mu_1$ and $1/\mu_2$ respectively.

4.4 Simulation Results and Discussion

The performance of the distributed node activation algorithm described in Section 4.1 is evaluated using simulations for a wide range of parameters for both the cases of global and local thresholds. In the representative simulation results presented here, the simulation setup and the parameters used are as follows. A total of $N = 52$ sensors, each having a circular coverage pattern of radius 12 units, are thrown uniformly at random in an area of size 50 × 50. With these parameters, the mean coverage of the network ($\bar{N}$), defined as the average number of sensors covering any point in the deployment region, is observed to be approximately 9.1. We use $\rho = 3$; for the event-based lifetime models, $Q/q = 100$. For the correlated event-based model, the number of blocks is set to 4. The utility function used is $U(n) = 1 - (1 - p_d)^n$ where $p_d = 0.1$. Also, the upper bound on the maximum achievable utility in this case is calculated to be 0.159511, from (4.2).

Figure 4.1: Performance for different lifetime models with global thresholds. At the global threshold of $m = \bar{N}/(1+\rho) \approx 2$, more than 75% performance is achieved for all lifetime models. (Plot of time-average utility against global threshold $m$; curves: Independent Lifetime, Correlated Lifetime, Independent Event, Correlated Event, Deterministic, Upper bound.)

Figure 4.2: Performance for different lifetime models with local thresholds. At the local threshold of $\alpha = 1$, more than 75% performance is achieved for all lifetime models. (Plot of time-average utility against local threshold parameter $\alpha$; curves: Independent Lifetime, Correlated Lifetime, Independent Event, Correlated Event, Deterministic, Upper bound.)

Figure 4.3: Performance under varying degrees of event correlation with global thresholds. At the global threshold of $m = \bar{N}/(1+\rho) \approx 2$, more than 75% performance is achieved for all lifetime models. (Plot of time-average utility against global threshold $m$; curves: 16 blocks, 4 blocks, 1 block, Independent Event, Upper bound.)

Figure 4.4: Performance under varying degrees of event correlation with local thresholds. At the local threshold of $\alpha = 1$, more than 75% performance is achieved for all lifetime models. (Plot of time-average utility against local threshold parameter $\alpha$; curves: 16 blocks, 4 blocks, 1 block, Independent Event, Upper bound.)

Figures 4.1 and 4.2 show the performance for the various models with global and local thresholds. Let us define $\alpha$, the local threshold parameter, as $\alpha = m_i / \left(\frac{n_i}{1+\rho}\right)$, where $n_i$ is the number of sensors (including sensor $i$) that cover the point where $i$ is located. Note that in Figure 4.2, the time-average utility is plotted against this local threshold parameter $\alpha$.

From Figure 4.1, we observe that with a fixed global threshold of $m = \bar{N}/(1+\rho)$ (≈ 2 in this case), the time-average utility achieved is greater than three-fourths of the upper bound. Similarly, from Figure 4.2, we observe that with a local threshold of $m_i = n_i/(1+\rho)$ (the case of $\alpha = 1$), the time-average utility achieved is greater than three-fourths of the upper bound as well. For all of the event models considered, we see that this threshold value also achieves close to the best performance attained over all thresholds. Simulations performed for other network configurations yielded similar results.
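For readers who wish to experiment with a setup of this kind, the following Python sketch estimates the mean coverage and a per-unit-area version of the bound (4.2) for one random deployment. It is an illustration under stated assumptions rather than the simulator used here: sensor positions are drawn uniformly in the square, $N(A)$ is taken as the number of sensors whose disc covers the point, the integral is approximated on a regular grid and divided by the area of the region (an assumption about how the reported bound is normalized), and all names are hypothetical. A particular random draw, and the treatment of the region's boundary, will generally lead to values that differ from those quoted above.

```python
import math
import random

def utility(x, p_d=0.1):
    # Assumed detection utility U(x) = 1 - (1 - p_d)^x, evaluated at real x.
    return 1.0 - (1.0 - p_d) ** x

def deploy(n_sensors, side, rng):
    # Sensor positions drawn uniformly at random in a side x side square.
    return [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n_sensors)]

def coverage_bound(positions, radius, side, rho, grid=50):
    """Return (mean coverage, grid estimate of (4.2) divided by the region area)."""
    step = side / grid
    cover_sum = 0
    bound_sum = 0.0
    for gx in range(grid):
        for gy in range(grid):
            point = ((gx + 0.5) * step, (gy + 0.5) * step)
            n = sum(1 for s in positions if math.dist(s, point) <= radius)
            cover_sum += n
            bound_sum += utility(n / (1.0 + rho))
    cells = grid * grid
    return cover_sum / cells, bound_sum / cells

if __name__ == "__main__":
    rng = random.Random(1)
    sensors = deploy(52, 50.0, rng)
    mean_cov, bound = coverage_bound(sensors, 12.0, 50.0, 3)
    print(mean_cov, bound)
```

By Jensen's inequality, the per-unit-area bound returned here can never exceed $U(\bar{N}/(1+\rho))$ evaluated at the estimated mean coverage, which gives a quick sanity check on the output.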
The performance for the deterministic model is close to the optimal for the thresholds of $m_i = n_i/(1+\rho)$ and $m = \bar{N}/(1+\rho)$ for the local and global threshold policies respectively. Note that Figures 4.1 and 4.2 also show that the performance at higher thresholds drops significantly as the degree of spatial correlation in the sensor recharge and discharge times increases. Figures 4.3 and 4.4 demonstrate this fact more clearly. Note that the degree of spatial correlation increases as the number of blocks becomes smaller (i.e. with the increase in the block size). With the increase in spatial correlation, the performance drops significantly for threshold activation policies at higher (global as well as local) thresholds.

In this chapter, we have demonstrated through simulations that if the threshold is appropriately chosen, our activation policy performs quite close to the global optimum, even in the general case where the sensor coverage areas could have complete, partial or no overlap with each other.

CHAPTER 5
Node Activation in Partially Rechargeable Sensor Systems

5.1 Introduction

The underlying sensor model studied in Chapters 3 and 4 is suitable where a sensor node needs to physically relocate to get recharged, i.e. once the sensor gets completely discharged, it needs to be taken to a recharging station where it is recharged until its batteries become fully charged. However, this sensor model has the following inherent restrictions:

• Disjoint Recharge: The recharge and discharge intervals of a sensor node are disjoint. An active sensor gets discharged continuously without getting recharged, while a passive sensor node gets recharged without getting discharged.

• Activation upon Complete Recharge: A sensor node can be activated only when it has been completely recharged.

In many scenarios, renewable energy sources would be available to the sensing devices at their sensing or deployment location itself.
Thus, a sensor can get recharged continuously, depending on the availability of the energy source. This allows the discharge and recharge of sensor batteries to take place simultaneously for an active sensing device. In addition, a sensor can be activated in the system to provide additional utility, as long as it has some energy to sense. That is, a sensor node can be activated even when it is only partially recharged. This leads us to a different sensor energy consumption model, namely the Partial Activation model, which we consider in this chapter.

One of the objectives of the sensor deployment is to provide better quality of coverage in the region. Utilizing the redundancy of sensing devices in the network effectively, so as to provide a better quality of coverage in the region, motivates the need to design efficient sensor node activation schedules in the system. In this chapter, we consider the problem of optimal node activation in a partially-rechargeable sensor system. We assume that the sensors get recharged continuously, according to a random process. These sensors activate themselves (i.e. participate in sensing and transmission) according to an activation policy employed in the system, and get discharged during the activation period, according to another random process. The rate of discharge is typically higher than the rate of recharge for a sensor node. The decision question that we address here is: "when and which sensors should be activated (or 'switched on') so that the time-average system utility is maximized?"

The recharge processes across different sensors in close vicinity can be correlated, since the intensity of sunlight, for instance, is expected to exhibit a significant degree of spatial correlation. In addition, the events of interest, which the sensors would like to detect, may be correlated in space as well.
That is, events of interest may occur simultaneously in a portion of the deployment area and would be detected by multiple, closely-located active sensors covering that area. In this chapter, we design node activation policies that utilize the redundancy of sensing devices in the network effectively so as to provide a better quality of coverage in the region. We model our rechargeable sensor system as a system of finite-buffer queues and show the existence of a simple threshold activation policy that achieves near-optimal performance. Our results hold both in the presence and in the absence of correlation in the discharge and/or recharge processes, thereby showing the robustness of such policies with respect to the degree of spatial correlation in the system.

This chapter is organized as follows. Section 5.2 describes the partially rechargeable sensor system model, presents the problem formulation and outlines our approach and contributions. Section 5.3 formulates the sensor activation problem as a stochastic decision question for the case where sensors have identical coverage areas, and describes the spatial correlation models. In Section 5.4, we argue that an aggressive (or greedy) activation policy does not perform well under most realistic scenarios, motivating the need to develop smarter activation policies. Section 5.5 describes threshold-based activation policies, discusses how such policies can be analyzed, and proves that a simple threshold policy achieves asymptotically optimal performance for the case of identical sensor coverages. Section 5.6 extends the threshold activation policies outlined in Section 5.5 to a more general scenario where sensor coverage areas may not overlap completely, and evaluates the performance of these policies through simulations. We summarize our analytical, numerical and simulation results in Section 5.7.

Figure 5.1: Quantum-queue model of a sensor. Here K is the sensor energy bucket size.
5.2 Model, Formulation and Contribution

5.2.1 System Model and Formulation

We assume that battery recharge at each of the sensors occurs in units of an energy quantum. Each sensor is modeled as a K-quantum bucket, i.e., a sensor battery can hold at most K quanta. Quanta arrive at the sensor according to a random process. During the activation period of a sensor, these accumulated quanta are used up one by one. Thus, a sensor can be modeled as a finite-buffer quantum-queue, as shown in Figure 5.1, with a buffer capacity of K. Note that this queue is discharged of quanta only when the sensor is activated. Therefore, the quantum service (or discharge) process depends on the chosen activation decision policy. We assume that the quantum arrival (recharge) process is Poisson with rate λ and that the quantum service (discharge) times are exponentially distributed with mean $1/\mu$. Let $\gamma = \mu/\lambda \geq 1$, since the recharge typically occurs at a slower rate than the discharge.

Consider a network of N sensors deployed redundantly to cover a region of interest. We assume that the performance of the system is characterized by a continuous, non-decreasing, strictly concave function U satisfying U(0) = 0, as in Chapter 3. More specifically, U(n) represents the utility derived per unit area, per unit time, from n ≤ N active sensors covering an area element. Note that different sensors can be located at different points in the overall physical space of interest, and the coverage patterns of different nodes can be different. Therefore, the coverage areas of different sensors (i.e., the regions where the different sensors can provide coverage) will typically be different. This implies that at any time, utilities in different parts of the area of interest can differ significantly from one another.

We are interested in maximizing the time-average utility of the system. Let A denote the entire area in the physical space of interest.
Let $n_\Pi(A, t)$ denote the number of active sensors that cover area element A at time t, under activation policy Π. Then the time-average utility under policy Π is given by

\[
\lim_{t \to \infty} \frac{1}{t} \int_0^t \int_A U\big(n_\Pi(A, t)\big) \, dA \, dt . \tag{5.1}
\]

In a Euclidean coordinate system, $dA = dx\,dy$ and $n_\Pi(A, t) = n_\Pi(x, y, t)$ in the above expression. The decision problem that we consider in this chapter is that of finding the activation policy Π so that the objective function given by (5.1) is maximized. Clearly, sensors with no energy (i.e., sensors whose quantum-queue is empty) cannot be activated. Therefore, our decision problem is that of determining how many, and which, sensors to activate at any time, from the set of available sensors (i.e., sensors with non-zero energy). Note that if we activate more sensors, we gain utility in the short time-scale. However, if the number of active sensors is already large, since the utility function exhibits diminishing returns, we may want to keep some of the available sensors "in store" for future use.

5.2.2 Methodology and Contribution

Due to the random nature of the recharging and discharging processes, the activation question outlined above is a stochastic decision problem. Under certain assumptions this can be formulated as a semi-Markov decision problem; however, determining optimal policies for this problem can be computationally prohibitive. Since sensors are energy constrained, we seek policies that can be implemented in a distributed manner with minimal information and computational overhead. Therefore, we focus on simple threshold policies and examine their performance. We show that near-optimal performance can be achieved by choosing an appropriate threshold policy.

To simplify the analysis and obtain fundamental performance insights, we examine the performance of threshold policies for a system of sensors whose coverage areas are identical.
In other words, we first consider a scenario where each of the N sensors is able to cover the entire area (region) of interest. In this case the objective function in equation (5.1) reduces to a single integral over the time domain. Under Markovian assumptions, we analyze the performance of an aggressive activation policy (defined precisely later), and show that such a policy is not desirable under most practical scenarios. We then analyze a threshold policy for an appropriate choice of threshold, and show that the performance of this policy is provably near-optimal under four different correlation models of the recharge/discharge processes. Our results demonstrate the robustness of threshold policies under spatial correlation across recharge and/or discharge processes.

This motivates our study of the performance of threshold-based policies in a general network setting, where coverage areas of different sensors could have partial, complete or no overlap with each other. Results from our extensive simulation studies show that even in this case, the performance of threshold activation policies is very close to the best achievable performance. It is worth noting here that the threshold policies that we study can be implemented in a large-scale sensor network in a distributed manner, with only local information. Therefore, our results show how performance close to the global optimum (which is computationally difficult to obtain) can be achieved with a simple activation policy using only local information.

Note that this sensor system model is significantly different from that of Chapter 3, where the sensor's recharge (discharge) process is modeled using an exponentially distributed recharge (discharge) time. In this chapter, we consider a completely different and more realistic energy consumption model for the sensors (in the form of a K-quantum bucket), and consider all the sensors which have a non-zero number of quanta in their bucket as being available for activation. In other words, we allow partially recharged sensors to be activated, and show that this could improve the system performance significantly over what was observed in Chapter 3. The performance bounds that we obtain here are significantly stronger than those observed in Chapter 3. We show that threshold policies can achieve a performance that is asymptotically optimal with respect to the buffer capacity K. Since K is expected to be fairly large in most practical scenarios (K represents the granularity of the energy recharge/discharge process), the difference between the performance of the threshold policy and that of the optimal policy is insignificant in almost all cases. Note that the eligibility of partially recharged sensors for activation plays a key role in this performance improvement. Also note that the consideration of the detailed energy model is partly necessitated by our desire to activate partially recharged sensors whenever appropriate.

5.3 Preliminaries

In this section, we describe the discharge and recharge processes and the models of spatial correlation considered. We also formulate the optimal node activation decision problem for the special case where all the sensors in the system have identical coverage areas. Finally, we upper bound the performance of any activation policy for the identical coverage case, under all correlation models.

5.3.1 Recharge and Discharge Process Models

We model the quanta arrival (recharge) process at each sensor by a Poisson process with rate λ. Note that in a realistic sensing environment, the discharge time of a quantum could depend on various random factors. For instance, sensors can transmit information (resulting in energy usage) on the occurrence of "interesting" events, which may be generated according to a random process. Therefore in our system model, we assume that the quantum discharge times are random. More specifically, we assume that the discharge time of each quantum is exponentially distributed with mean $1/\mu$.
Note here that we assume µ ≥ λ, which captures the fact that the discharge rate of a sensor in the active state is no less than its recharge rate. We assume that there is no energy discharge when a sensor is not active. Note that although a sensor can power itself off when it is not active, it has to wake up periodically and exchange messages with its neighbors to keep track of the system state in its neighborhood (so as to decide whether to activate itself or not). Therefore, in reality, one would expect that energy will be drained even when a sensor is not active (as long as it has non-zero energy), but probably at a fairly steady rate. However, the energy discharge rate in the non-active state can be expected to be much slower than the discharge rate in the active state, and is not considered in our analysis.

5.3.2 Identical Sensor Coverage

We consider the scenario where all the sensors deployed in the network are able to cover the region of interest. In other words, we consider a system of N sensors covering the region of interest. Let $n_\Pi(t)$ denote the number of sensors in the active state at time t under policy Π. Since all the sensors cover the region of interest, the optimization problem in the identical coverage case can be posed as that of finding a policy Π that maximizes $\bar{U}(\Pi)$, where $\bar{U}(\Pi)$ is defined as

\[
\bar{U}(\Pi) = \lim_{t \to \infty} \frac{1}{t} \int_0^t U\big(n_\Pi(t)\big) \, dt . \tag{5.2}
\]

We assume that activation decisions can be taken at any instant of time. As we argue later in the chapter, these decisions need to be taken only when some active sensor runs out of energy, or when some sensor with zero energy becomes available by gaining a quantum through recharging. It is worth noting here that although we will address our decision question in the identical coverage case from the perspective of a centralized decision maker, the decision policy that we develop can easily be implemented in a decentralized manner.
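Since the number of active sensors changes only at discrete event instants, the integral in (5.2) reduces, over any finite horizon, to a weighted sum. The following is a minimal sketch of that computation, assuming the trajectory of the active-sensor count is available as a list of event times and the count held after each event. The function name, its arguments and the example utility function are illustrative assumptions and are not part of the model itself.

```python
def time_average_utility(event_times, active_counts, horizon, utility):
    """Finite-horizon version of (5.2) for a piecewise-constant trajectory.

    active_counts[k] is the number of active sensors on the interval
    [event_times[k], event_times[k + 1]); the last count is held until
    `horizon`. event_times is assumed sorted and to start at 0.
    """
    if len(event_times) != len(active_counts):
        raise ValueError("event_times and active_counts must have equal length")
    total = 0.0
    for k, n in enumerate(active_counts):
        start = event_times[k]
        if start >= horizon:
            break
        end = event_times[k + 1] if k + 1 < len(event_times) else horizon
        # Utility is constant on this interval, so it contributes U(n) * length.
        total += utility(n) * (min(end, horizon) - start)
    return total / horizon


# Hypothetical trajectory: 2 sensors active on [0, 1), none on [1, 3), 1 on [3, 4).
example_utility = lambda n: 1.0 - 0.5 ** n  # assumed concave utility with U(0) = 0
average = time_average_utility([0.0, 1.0, 3.0], [2, 0, 1], 4.0, example_utility)
```

Letting the horizon grow recovers the limit in (5.2) whenever that limit exists.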
5.3.3 Spatial Correlation Models

As we discuss later in detail, the performance of decision policies depends considerably on how the discharge (and recharge) processes at the different sensors are correlated. We consider two different extremes of correlation for both the recharge and discharge processes.

We consider two models of the discharge times of the different sensors, namely the Independent Discharge (ID) and Correlated Discharge (CD) process models. In the former model, the quantum discharge times of the active sensors are independent of each other, while in the latter model, all active sensors get discharged of a quantum at the same time. The two correlation models can be practically motivated in the following way. To motivate the ID model, consider a scenario where data transmission (on the detection of interesting events) is the primary mode of energy expenditure, as is often the case in practice. Moreover, assume that the detection of an event and/or the subsequent data transmission is random. This could happen, for instance, if a certain failure probability is associated with the detection of an event at each sensor, or if an event is reported only by one or a subset of the sensors (randomly chosen) detecting the event (to avoid redundant event reporting). In such scenarios, the discharge in the system is better represented by the ID model. To motivate the CD model, consider a scenario where all active sensors report data on a regular basis, or the data reporting is so infrequent that most of the energy expenditure occurs in sensing and processing. In these cases, the active sensors will all get discharged at the same rate, and therefore, the system is better modeled by the CD model.

Similar correlation could occur during the recharging of the sensors as well. Due to the spatial vicinity of the sensor placement, recharge quanta may be received at all the sensor nodes at the same time, leading to a Correlated Recharge (CR) process model.
On the other hand, recharge quanta arriving independently at the different sensors is captured by the Independent Recharge (IR) process model.

Note that these two models (for recharge and discharge) represent two extreme forms of correlation, and real-life situations can be expected to fall in between these two extremes. The results we obtain, however, suggest that the optimal threshold policy performs well in both extremes; therefore, this policy is expected to perform well in intermediate correlation scenarios as well. Nevertheless, we consider such intermediate correlation scenarios in Section 5.6 in our simulation-based evaluation.

Considering all possible combinations of correlated and independent discharge and recharge processes provides us with four different sensor system models:

• Independent Discharge Recharge (IDR) model: The discharge as well as the recharge processes at the different sensors are independent.

• Correlated Discharge Recharge (CDR) model: The discharge as well as the recharge processes at the different sensors are completely correlated, i.e., recharge quanta arrive at the same time at all the sensors and these quanta get consumed, one by one, at the same time from all the active sensors.

• Correlated Discharge Independent Recharge (CDIR) model: The discharge processes at the different sensors are completely correlated, while the recharge processes are independent.

• Independent Discharge Correlated Recharge (IDCR) model: The discharge processes at the different sensors are independent, while the recharge processes are completely correlated.

5.3.4 Upper Bound on the Optimal Time-average Utility

Since the optimal time-average utility is difficult to compute, we obtain an upper bound on it. We will evaluate and compare the performance of our proposed activation policies with respect to this bound. Note that the following result holds for all correlation models.
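Theorem 7 The time-average utility achieved by any node activation policy Π, $\bar{U}(\Pi)$, is upper bounded as

\[
\bar{U}(\Pi) \leq U\!\left(\frac{N}{\gamma}\right).
\]

Proof: The proof involves concavity arguments and Jensen's inequality [76]. Let $f$ and $p$ be measurable functions finite a.e. on $R$. Suppose that $fp$ and $p$ are integrable on $R$, $p \geq 0$, and $\int_R p > 0$. If $\phi$ is convex in an interval containing the range of $f$, then Jensen's inequality states that

\[
\phi\!\left(\frac{\int_R f p}{\int_R p}\right) \leq \frac{\int_R \phi(f)\, p}{\int_R p} .
\]

Let $n(t)$ denote the number of sensors that are active (and hence discharging) at time t. Since $U(\cdot)$ is concave, $-U(\cdot)$ is convex; substituting $\phi = -U(\cdot)$, $f = n(t)$ and $p = 1$ on $[0, T]$ in the above, Jensen's inequality implies that

\[
U\!\left(\frac{\int_0^T n(t)\,dt}{T}\right) \geq \frac{\int_0^T U(n(t))\,dt}{T} .
\]

Since $U(\cdot)$ is continuous, we have

\[
\lim_{T \to \infty} U\!\left(\frac{\int_0^T n(t)\,dt}{T}\right) \geq \lim_{T \to \infty} \frac{\int_0^T U(n(t))\,dt}{T} .
\]

Therefore, it suffices to show that

\[
\lim_{T \to \infty} U\!\left(\frac{\int_0^T n(t)\,dt}{T}\right) \leq U\!\left(\frac{N}{\gamma}\right).
\]

Define $\psi_i(t)$ such that $\psi_i(t) = 1$ if sensor i is discharging at time t and $\psi_i(t) = 0$ otherwise. Then, continuity of $U(\cdot)$ also implies

\[
\lim_{T \to \infty} U\!\left(\frac{\int_0^T n(t)\,dt}{T}\right)
= U\!\left(\lim_{T \to \infty} \frac{\int_0^T n(t)\,dt}{T}\right)
= U\!\left(\lim_{T \to \infty} \frac{\int_0^T \sum_{i=1}^{N} \psi_i(t)\,dt}{T}\right).
\]

Since $\psi_i(t)$ is non-negative and bounded,

\[
\lim_{T \to \infty} U\!\left(\frac{\int_0^T n(t)\,dt}{T}\right)
= U\!\left(\lim_{T \to \infty} \sum_{i=1}^{N} \frac{\int_0^T \psi_i(t)\,dt}{T}\right)
= U\!\left(\sum_{i=1}^{N} \lim_{T \to \infty} \frac{\int_0^T \psi_i(t)\,dt}{T}\right).
\]

Further, since all sensors are identical, for any sensor i,

\[
\lim_{T \to \infty} U\!\left(\frac{\int_0^T n(t)\,dt}{T}\right)
= U\!\left(N \lim_{T \to \infty} \frac{\int_0^T \psi_i(t)\,dt}{T}\right).
\]

Since the recharging is modeled as a Poisson process with parameter λ, the discharging times are assumed to be exponentially distributed with mean $1/\mu$, and each sensor is assumed to have an energy capacity equivalent to K quanta, the fraction of time a sensor is discharging is bounded above by the steady-state probability that the server in an M/M/1/K queue is busy. Therefore, we have

\[
\lim_{T \to \infty} \frac{\int_0^T \psi_i(t)\,dt}{T} \leq \frac{\gamma^K - 1}{\gamma^{K+1} - 1} .
\]

Since $\gamma \geq 1$, we have $\frac{\gamma^K - 1}{\gamma^{K+1} - 1} \leq \frac{1}{\gamma}$ for finite K, implying

\[
\lim_{T \to \infty} \frac{\int_0^T \psi_i(t)\,dt}{T} \leq \frac{1}{\gamma} .
\]

Therefore we have

\[
\lim_{T \to \infty} U\!\left(\frac{\int_0^T n(t)\,dt}{T}\right) \leq U\!\left(\frac{N}{\gamma}\right).
\]

This implies

\[
\lim_{T \to \infty} \frac{\int_0^T U(n(t))\,dt}{T} \leq U\!\left(\frac{N}{\gamma}\right).
\]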
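The two quantities used in the last steps of the proof are easy to evaluate numerically. The sketch below computes the busy probability of the M/M/1/K queue and the resulting bound U(N/γ); it is an illustration under stated assumptions rather than part of the proof. The function names are hypothetical, the expression is rewritten as $(1 - \gamma^{-K})/(\gamma - \gamma^{-K})$ purely to avoid floating-point overflow for large K, and the value K/(K+1) at γ = 1 is the limit of the expression in the proof.

```python
def busy_fraction(gamma, K):
    """Busy probability of an M/M/1/K queue with gamma = mu / lambda >= 1.

    Equals (gamma**K - 1) / (gamma**(K + 1) - 1); numerator and denominator
    are divided by gamma**K so that large K does not overflow.
    """
    if gamma < 1:
        raise ValueError("the model assumes gamma >= 1")
    if gamma == 1:
        return K / (K + 1.0)  # limiting value as gamma -> 1
    g = float(gamma)
    r = g ** (-K)
    return (1.0 - r) / (g - r)


def utility_upper_bound(utility, N, gamma):
    """Upper bound U(N / gamma) on the time-average utility of any policy."""
    return utility(N / gamma)


# Illustrative values only; the utility function below is an assumption.
example_utility = lambda n: 1.0 - 0.5 ** n
fraction = busy_fraction(4, 100)                     # never exceeds 1 / gamma
bound = utility_upper_bound(example_utility, 16, 4)  # U(4)
```

For any γ ≥ 1 the returned fraction stays at or below 1/γ, which is the inequality the proof relies on.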
In summary, the time-average utility under any policy cannot exceed $U(N/\gamma)$. The result holds even when the random variables $\psi_i(t)$ associated with different sensors i are not independent.

5.4 Aggressive Activation Policy

In this section we define and analyze an aggressive (or greedy) activation policy. In the aggressive activation policy, $\Pi^A$, each sensor activates itself as soon as it has a non-zero energy level. In other words, all the sensors activate themselves whenever possible.

5.4.1 Analysis

Let $\bar{U}(\Pi^A)$ denote the time-average utility achieved by the aggressive activation policy (AAP). Let us first consider a system with a single sensor node (N = 1), having a charge bucket size of K. Recall that the recharge rate equals λ, the discharge times of each quantum are exponentially distributed with mean $1/\mu$, and $\gamma = \mu/\lambda \geq 1$. Under aggressive activation, the sensor can be analyzed using an M/M/1/K queue, where the sensor provides a utility of U(1) as long as its charge bucket is non-empty, i.e. when there exists at least one quantum in the system. Let x denote the probability that there is no quantum in the system. Then,

\[
x = \frac{\gamma^K (\gamma - 1)}{\gamma^{K+1} - 1} \tag{5.3}
\]

when γ > 1 and $x = \frac{1}{K+1}$ when γ = 1. The time-average utility of the system is given by

\[
\bar{U}(\Pi^A) = (1 - x)\,U(1) = \left(\frac{\gamma^K - 1}{\gamma^{K+1} - 1}\right) U(1) \tag{5.4}
\]

when γ > 1 and by

\[
\bar{U}(\Pi^A) = (1 - x)\,U(1) = \left(\frac{K}{K+1}\right) U(1) \tag{5.5}
\]

when γ = 1.

Note that for γ = 1, the aggressive activation policy is asymptotically optimal with respect to K, since the upper bound on the achievable utility equals U(1) (from Theorem 7). We also observe (from (5.4)) that for γ > 1 and for large buffer size K, $\bar{U}(\Pi^A) \to \frac{U(1)}{\gamma} = \frac{U(N)}{\gamma}$. Next we consider the case of general N (N > 1) and show that both the above facts hold true in this case as well. We consider the IDR and CDR models presented earlier.
• IDR Model: The behavior of each sensor can be analyzed in terms of independent M/M/1/K queues that provide for utility as long as their charge bucket is non-empty. The probability that the charge bucket is empty for any of the sensor nodes is given by (5.3). At any given time t, the probability that i out of the N sensors would be active (i.e. i of the N queues would have non-zero quanta) is given by

\[
Q(i) = \binom{N}{i} (1 - x)^i x^{N-i} . \tag{5.6}
\]

The steady-state time average utility achieved in the system is given by

\[
\bar{U}(\Pi^A) = \sum_{i=1}^{N} Q(i)\, U(i) . \tag{5.7}
\]

Expanding, and using the fact $\frac{U(i)}{i} \geq \frac{U(N)}{N}$, $\forall i \in [1, \ldots, N-1]$, we have

\[
\begin{aligned}
\bar{U}(\Pi^A) &= \sum_{i=1}^{N} \binom{N}{i} (1 - x)^i x^{N-i}\, U(i) \\
&\geq \sum_{i=1}^{N} \frac{i}{N} \binom{N}{i} (1 - x)^i x^{N-i}\, U(N) \\
&= \left[ \sum_{j=0}^{N-1} \binom{N-1}{j} (1 - x)^j x^{N-1-j} \right] (1 - x)\, U(N) \\
&= (1 - x)\, U(N) = \left(\frac{\gamma^K - 1}{\gamma^{K+1} - 1}\right) U(N)
\end{aligned}
\tag{5.8}
\]

when γ > 1 and $\bar{U}(\Pi^A) \geq (1 - x)\,U(N) \geq \left(\frac{K}{K+1}\right) U(N)$ when γ = 1.

• CDR Model: When the discharge and recharge processes at all the sensors are perfectly correlated, operating under the aggressive activation policy all the sensors get activated (and deactivated) at the same time. This system can be represented as one M/M/1/K queue providing a utility of U(N) in the system as long as the queue has non-zero quanta. Thus,

\[
\bar{U}(\Pi^A) = (1 - x)\, U(N) = \left(\frac{\gamma^K - 1}{\gamma^{K+1} - 1}\right) U(N) \tag{5.9}
\]

when γ > 1 and $\bar{U}(\Pi^A) = \left(\frac{K}{K+1}\right) U(N)$ when γ = 1.

For the CDIR (and IDCR) model, since the recharge (discharge) processes are independent across the sensor nodes, an analysis similar to that used above can be applied to show similar results.

Thus, the time average utility for the aggressive activation policy for all the above sensor system models is lower bounded by $(1 - x)\,U(N)$. Note that K represents the capacity of the energy bucket in terms of the unit of discharge/recharge and can therefore be assumed to be sufficiently large. Therefore, considering the asymptotic behavior as K → ∞, we have the following result.
Lemma 8 The asymptotic time average utility for the aggressive activation policy satisfies

\[
\frac{U(N)}{\gamma} \leq \bar{U}(\Pi^A) \leq U\!\left(\frac{N}{\gamma}\right).
\]

Note that the lower bound derived here is tight, since for the CDR model the performance achieved equals the lower bound ($U(N)/\gamma$). It is also interesting to observe that as γ approaches one, i.e. when the discharge and recharge rates become equal, the lower bound approaches the upper bound and the aggressive activation policy is asymptotically optimal w.r.t. K, from Theorem 7.

In cases where γ ≫ 1, the lower bound given by Lemma 8 could be loose for the IDR model and hence we derive a different lower bound on $\bar{U}(\Pi^A)$ in terms of a fraction of the maximum achievable utility. We assume that $K \geq \log_\gamma N$. In the following result and its proof, we assume for simplicity of exposition that N is a multiple of γ. However, if N is not a multiple of γ, the result can be shown to hold with the lower bound replaced by $\frac{1}{2} U\!\left(\lfloor N/\gamma \rfloor\right)$.

Lemma 9 The time average utility for the aggressive activation policy for the IDR model is lower bounded as

\[
\bar{U}(\Pi^A) \geq \frac{1}{2}\, U\!\left(\frac{N}{\gamma}\right).
\]

Proof: Let $w = N/\gamma$, such that $1 \leq w \leq N$. Now from (5.7), and using the facts that $U(i) \geq U(w)$, $\forall i \in [w, \ldots, N]$, and $\frac{U(i)}{i} \geq \frac{U(w)}{w}$, $\forall i \in [1, \ldots, w-1]$, we have

\[
\begin{aligned}
\bar{U}(\Pi^A) &= \sum_{i=1}^{N} Q(i)\,U(i) = \sum_{i=1}^{w-1} Q(i)\,U(i) + \sum_{i=w}^{N} Q(i)\,U(i) \\
&\geq U(w) \left[ \sum_{i=1}^{w-1} \frac{i}{w}\, Q(i) + \sum_{i=w}^{N} Q(i) \right].
\end{aligned}
\]

Therefore,

\[
\begin{aligned}
\frac{\bar{U}(\Pi^A)}{U(w)} &\geq \sum_{i=1}^{w-1} \frac{i\gamma}{N} \binom{N}{i} (1-x)^i x^{N-i} + \sum_{i=w}^{N} \binom{N}{i} (1-x)^i x^{N-i} \\
&= (1-x)\gamma \sum_{i=1}^{w-1} \binom{N-1}{i-1} (1-x)^{i-1} x^{N-i} + \sum_{i=w}^{N} \binom{N}{i} (1-x)^i x^{N-i} \\
&= (1-x)\gamma \sum_{j=0}^{w-2} \binom{N-1}{j} (1-x)^{j} x^{N-1-j} + \sum_{i=w}^{N} \binom{N}{i} (1-x)^i x^{N-i} .
\end{aligned}
\]

In order to simplify the R.H.S. above, we use two useful inequalities. The first inequality is given by, for $0 \leq j < w$,

\[
\binom{N-1}{j} x^{N-1-j} (1-x)^j\, \gamma (1-x) \geq \frac{1}{2} \binom{N}{j} x^{N-j} (1-x)^j . \tag{5.10}
\]

To show that the above inequality holds, we need to show that $(N - j)\,\gamma\,(1 - x) \geq \frac{1}{2} N x$. Since $j < w$, it suffices to show that $\gamma \geq \frac{1}{2} \frac{N x}{(1-x)\left(N - \frac{N}{\gamma}\right)}$, or $\gamma \geq \frac{1}{2} \frac{x \gamma}{(1-x)(\gamma - 1)}$. Simplifying, we get $\gamma \geq \frac{1}{2} \frac{\gamma^{K+1}}{\gamma^K - 1}$, which is true when $\gamma^K \geq 2$.

The other inequality we use is given by

\[
\binom{N}{w} (1-x)^w x^{N-w} \geq \binom{N}{w-1} (1-x)^{w-1} x^{N-w+1} . \tag{5.11}
\]

To show that the above inequality holds, we need to show that $(N + 1 - w)(1 - x) \geq w x$, or $(N + 1)(1 - x) \geq w = \frac{N}{\gamma}$. Simplifying, we get $(\gamma^{K+1} - \gamma)(N + 1) \geq N(\gamma^{K+1} - 1)$, which is true when $\gamma^K \geq N$.

Thus, (5.10) holds for $K \geq \log_\gamma 2$ and (5.11) holds for $K \geq \log_\gamma N$. Note that K is the granularity of the energy bucket and hence can be assumed to be sufficiently large. Using (5.10) and (5.11), we have

\[
\begin{aligned}
\frac{\bar{U}(\Pi^A)}{U(w)} &\geq (1-x)\gamma \sum_{j=0}^{w-2} \binom{N-1}{j} (1-x)^{j} x^{N-1-j} + \sum_{i=w}^{N} \binom{N}{i} (1-x)^i x^{N-i} \\
&\geq \frac{1}{2} \sum_{j=0}^{w-2} \binom{N}{j} (1-x)^{j} x^{N-j} + \sum_{i=w}^{N} \binom{N}{i} (1-x)^i x^{N-i} \\
&\geq \frac{1}{2} \sum_{j=0}^{w-2} \binom{N}{j} (1-x)^{j} x^{N-j} + \frac{1}{2} \binom{N}{w} (1-x)^{w} x^{N-w} \\
&\qquad + \frac{1}{2} \binom{N}{w} (1-x)^{w} x^{N-w} + \sum_{i=w+1}^{N} \binom{N}{i} (1-x)^i x^{N-i} \\
&\geq \frac{1}{2} \sum_{j=0}^{w-2} \binom{N}{j} (1-x)^{j} x^{N-j} + \frac{1}{2} \binom{N}{w-1} (1-x)^{w-1} x^{N-w+1} \\
&\qquad + \frac{1}{2} \binom{N}{w} (1-x)^{w} x^{N-w} + \sum_{i=w+1}^{N} \binom{N}{i} (1-x)^i x^{N-i} \\
&\geq \frac{1}{2} \sum_{i=0}^{N} \binom{N}{i} (1-x)^{i} x^{N-i} + \frac{1}{2} \sum_{i=w+1}^{N} \binom{N}{i} (1-x)^i x^{N-i} \\
&\geq \frac{1}{2} ,
\end{aligned}
\]

where the last step uses the fact that the first sum equals one by the binomial theorem.

From Lemmas 8 and 9, we have the following result.

Corollary 10 The asymptotic time average utility for the IDR model is lower bounded as

\[
\bar{U}(\Pi^A) \geq \max\left\{ \frac{1}{2}\, U\!\left(\frac{N}{\gamma}\right),\ \frac{U(N)}{\gamma} \right\}.
\]

From Lemma 8, we observe that AAP is near-optimal for both correlation models when γ = 1. For large values of γ, i.e. when the recharge rate is considerably smaller than the discharge rate, the performance may be far from optimal (as suggested by Lemma 8), the difference depending upon the shape of the utility function U. The performance degradation is likely to increase in the presence of correlation across the recharge/discharge processes at the different sensor nodes, as suggested by (5.8)-(5.9). For the IDR system model, however, the performance of AAP is at least 50% of the maximum achievable performance (from Corollary 10), for all values of γ.
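The two conditions on K used in the proof of Lemma 9 can be checked numerically for a given configuration. The following sketch tests the reduced forms of (5.10) and (5.11), namely $(N-j)\gamma(1-x) \geq \frac{1}{2}Nx$ for all $0 \leq j < w$ and $(N+1-w)(1-x) \geq wx$, with x taken from (5.3). It assumes, as the proof does, that γ is an integer dividing N; the function names are hypothetical, and x is written as $(\gamma-1)/(\gamma-\gamma^{-K})$, which is algebraically the same as (5.3) but avoids overflow for large K.

```python
def empty_probability(gamma, K):
    """Probability x of an empty bucket, as in (5.3); 1/(K+1) when gamma == 1."""
    if gamma == 1:
        return 1.0 / (K + 1)
    g = float(gamma)
    return (g - 1.0) / (g - g ** (-K))


def lemma9_conditions(N, gamma, K):
    """Return (holds_510, holds_511) for the reduced forms of (5.10) and (5.11).

    Assumes gamma is a positive integer that divides N, so w = N / gamma.
    """
    x = empty_probability(gamma, K)
    w = N // gamma
    holds_510 = all((N - j) * gamma * (1 - x) >= 0.5 * N * x for j in range(w))
    holds_511 = (N + 1 - w) * (1 - x) >= w * x
    return holds_510, holds_511


# With N = 16 and gamma = 4, a bucket of K = 100 satisfies K >= log_gamma(N) = 2.
large_bucket = lemma9_conditions(16, 4, 100)
# With K = 1 we have gamma**K = 4 < N, and the check of (5.11) fails.
small_bucket = lemma9_conditions(16, 4, 1)
```

The small-bucket case illustrates why the lemma is stated under the assumption $K \geq \log_\gamma N$.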
Figure 5.2: Performance of Aggressive Activation Policy. Each panel plots the time average utility for the upper bound (UB) and for the IDR, CDIR, IDCR and CDR models: (a) varying detection probability $p_d$, (b) varying discharge-recharge ratio γ, (c) varying number of sensors N, (d) varying energy bucket size K. The performance degrades sharply with presence of correlation across discharge and recharge processes.

5.4.2 Simulation Results

We study the performance of the aggressive activation policy under the various sensor system models for a range of system parameters. Figure 5.2 plots the typical performance trends. The system parameters used are given by λ = 1, µ = γ, N = 16, K = 100, $p_d$ = 0.5, γ = 4 (except when one of these varies on the x-axis). The utility function is given by $U(n) = 1 - (1 - p_d)^n$.

We observe that the performance of AAP under the various system models follows a definite and clear trend. We observe empirically that the IDR model results in the best performance, and the CDR model the worst. Furthermore, performance under the IDCR model is worse than that under the CDIR model. This suggests that correlation in recharge processes across the sensor nodes has a greater effect on performance than correlation in discharge processes. We also observe that for the IDR system model, AAP performs quite close to the upper bound given by Theorem 7, demonstrating the tightness of the bounds in Theorem 7 and Lemma 8. Under all system configurations, AAP performance for the IDR model is at least 50% of the maximum achievable performance, as stated in Lemma 9.
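For the two models with closed-form expressions, the AAP utilities can be evaluated directly at the default parameters listed above. The sketch below evaluates (5.7) for the IDR model and (5.9) for the CDR model, together with the upper bound U(N/γ) of Theorem 7. It only illustrates the analytical expressions and is not the simulation used to produce Figure 5.2; the function names are hypothetical, and x is computed as $(\gamma-1)/(\gamma-\gamma^{-K})$, which is algebraically equal to (5.3).

```python
from math import comb


def empty_probability(gamma, K):
    """Probability x of an empty bucket, as in (5.3); 1/(K+1) when gamma == 1."""
    if gamma == 1:
        return 1.0 / (K + 1)
    g = float(gamma)
    return (g - 1.0) / (g - g ** (-K))


def aap_utility_idr(utility, N, gamma, K):
    """Time-average AAP utility under the IDR model, from (5.6) and (5.7)."""
    x = empty_probability(gamma, K)
    return sum(
        comb(N, i) * (1 - x) ** i * x ** (N - i) * utility(i)
        for i in range(1, N + 1)
    )


def aap_utility_cdr(utility, N, gamma, K):
    """Time-average AAP utility under the CDR model, from (5.9)."""
    return (1 - empty_probability(gamma, K)) * utility(N)


# Default parameters quoted in the text: N = 16, K = 100, p_d = 0.5, gamma = 4.
p_d = 0.5
detection_utility = lambda n: 1.0 - (1.0 - p_d) ** n
idr = aap_utility_idr(detection_utility, 16, 4, 100)
cdr = aap_utility_cdr(detection_utility, 16, 4, 100)
upper = detection_utility(16 / 4)
```

Comparing `cdr`, `idr` and `upper` reproduces the ordering discussed in this section: the CDR value sits near the lower bound U(N)/γ of Lemma 8, while the IDR value lies between half of the upper bound and the upper bound itself.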
Even though AAP performs close to optimal under the IDR model, the performance degrades substantially in the presence of correlation in the discharge/recharge processes across the sensor nodes. This motivates the need to develop efficient activation policies which would be robust to the presence of such correlations.

From Figure 5.2(a), we observe that as the detection probability $p_d$ increases, the gap between AAP performance and the upper bound increases, particularly in the presence of correlated discharge/recharge. Figure 5.2(b) reiterates the fact that when γ = 1, AAP performs optimally under all the system models. However, as the recharge rate decreases in comparison to the discharge rate (increase in γ), the performance degrades consistently, more so in the presence of correlations. Figure 5.2(c) shows the performance of AAP with increase in the number of sensor nodes covering the region of interest. In general, the performance of AAP increases in the presence of redundant sensor nodes, but falls increasingly short of the maximum achievable performance as N increases (except for the IDR model).

Figure 5.2(d) depicts the performance of AAP while varying the sensor energy bucket size K. We observe that even an increase in K by several orders of magnitude does not lead to better performance for AAP under any of the system models. This is so because, operating under AAP, the sensors try to activate themselves whenever possible, and under restricted recharge rates they end up operating with a minimal energy level for most of their operation. In other words, a sensor's energy bucket almost never reaches K and hence increasing K does not impact the performance positively. A sensor operating under a smart activation policy is likely to utilize the larger size of its energy bucket to extract performance benefits, as is the case with the optimal threshold-based policy, which is considered in Section 5.5.
Note that the lower bound $U(N)/\gamma$ on AAP performance given by Lemma 8 is the same as the performance of AAP under the CDR system model (from (5.9)), and hence is not plotted separately in the above figures. Also note that for large values of N and sufficiently large values of $p_d$, U(N) → 1, and AAP performance under the CDR system model, given by $U(N)/\gamma$, converges to $1/\gamma$.

We observe that AAP does not perform well in the presence of correlations, and is not desirable under most realistic system scenarios. Therefore, we look at threshold-based activation policies next, and show that there exists an activation policy which is asymptotically optimal w.r.t. K under all system models and parameters.

5.5 Threshold Activation Policies

A threshold activation policy with parameter m is characterized as follows. An available sensor (i.e., a sensor with non-zero energy) s is activated if the number of active sensors in the system is less than the threshold m; otherwise, sensor s is not activated. Thus a threshold policy with a threshold of m tries to maintain the number of active sensors in the system as close to m as possible. However, the number of active sensors never exceeds m (refer to Algorithm 1). Once a sensor gets activated, it remains active until it gets completely discharged. Once completely discharged, the sensor stays inactive until it receives a recharge quantum, at which time the sensor moves to the available state. In this state, a sensor gets activated as soon as the number of active sensors in the system falls below the specified threshold m. Note that the AAP activation algorithm corresponds to a threshold activation policy with the threshold parameter m = N.

For the identical sensor coverage case, the time-average utility of the system can be computed by tracking the number of active sensors over time.
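The activation rule described above can be illustrated with a small event-driven simulation for the identical coverage case. The sketch below is a minimal illustration under the IDR assumptions (independent Poisson recharge at rate λ per sensor, independent exponential discharge at rate µ per active sensor); it is not Algorithm 1 itself and not the simulator used in this thesis. The function name and arguments are hypothetical, all buckets are assumed to start empty, and ties between available sensors are broken by lowest index, a choice the policy description leaves open.

```python
import random


def simulate_threshold_idr(N, m, K, lam, mu, utility, horizon, seed=0):
    """Simulate a threshold-m policy under the IDR model on [0, horizon].

    Returns (time-average utility, largest number of simultaneously active sensors).
    """
    rng = random.Random(seed)
    energy = [0] * N      # quanta held by each sensor (assumed to start empty)
    active = set()        # indices of currently active sensors
    t, accumulated, max_active = 0.0, 0.0, 0
    while t < horizon:
        recharge_rate = N * lam
        total_rate = recharge_rate + len(active) * mu
        dt = min(rng.expovariate(total_rate), horizon - t)
        accumulated += utility(len(active)) * dt
        t += dt
        if t >= horizon:
            break
        if not active or rng.random() * total_rate < recharge_rate:
            # Recharge event at a uniformly chosen sensor; lost if the bucket is full.
            i = rng.randrange(N)
            energy[i] = min(K, energy[i] + 1)
        else:
            # Discharge event at a uniformly chosen active sensor.
            i = rng.choice(sorted(active))
            energy[i] -= 1
            if energy[i] == 0:
                active.remove(i)  # completely discharged, so it deactivates
        # Threshold rule: activate available sensors while fewer than m are active.
        for s in range(N):
            if len(active) >= m:
                break
            if energy[s] > 0 and s not in active:
                active.add(s)
        max_active = max(max_active, len(active))
    return accumulated / horizon, max_active


# Illustrative run with assumed parameters: N = 8, gamma = mu / lam = 4, m = N / gamma = 2.
average, peak = simulate_threshold_idr(
    N=8, m=2, K=20, lam=1.0, mu=4.0,
    utility=lambda n: 1.0 - 0.5 ** n, horizon=200.0, seed=1,
)
```

By construction the number of active sensors never exceeds m, so the simulated time-average utility cannot exceed U(m); setting m = N recovers the aggressive activation policy.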
Consider the IDR system model with N sensors, each having a recharge rate of λ and a discharge rate of µ, under a threshold activation policy with parameter m. Such a system can be represented as a system of N queues with m exponential servers, where each queue represents a sensor node. The arrival rate at each of the N queues equals λ, and the service times of all the m servers are exponentially distributed with mean $1/\mu$. Each server selects a queue with non-zero quanta and serves it until the queue becomes empty (i.e., the sensor gets discharged completely), after which the server tries to select another non-empty queue which is not being served (if available), representing the activation of an available sensor node, and so on. Note that the switch-over time of the server is zero. Thus, this queueing system captures the behavior of the sensor system under the threshold activation policy with parameter m. The number of active sensors in the system at any given time corresponds to the number of servers which are busy at that time. Figure 5.3(a) depicts the equivalent queueing system for the IDR model.

Figure 5.3: Queuing System representation of sensor system under Threshold based Activation. (a) IDR Model, (b) CDR Model.

In practical scenarios, due to the proximity in sensor node placements, the discharge and/or recharge processes at different sensors could be correlated. A perfect correlation in the discharge and the recharge processes at the sensors, along with threshold-based activation, leads to batch activation and deactivation of these sensors. In this case, for a threshold of m, the system can be represented as a polling system with $c = N/m$ queues (assuming c is an integer) and one exponential server, where each of these queues represents a group of m sensors. The arrival rate at each of the c queues equals λ, and the service time at the server is exponentially distributed with mean $1/\mu$.
Again, the server selects one of the non-empty queues and serves it until the queue becomes empty, at which time the server selects another non-empty queue (with zero switch-over time) and so on. The number of active sensors in the system is given by m when the server is busy, and zero otherwise. This system of queues corresponds to the CDR system model and is depicted in Figure 5.3(b). Note that if m is not a factor of N, the CDR system model with a threshold of m cannot be represented by Figure 5.3(b). However, for analytical tractability, we only consider those policies which apply a threshold of m such that m is a factor of N. Equivalent queueing system models, similar to those corresponding to the IDR and CDR models, can be formed for the IDCR and CDIR models as well.

5.5.1 Analysis of Threshold Policies

Consider a system comprising N sensor nodes. Each of the sensors is subject to a recharge process which is Poisson with rate λ. When activated, a sensor is also subject to a discharge process which is Poisson with rate µ (refer to Figure 5.1). For simplicity of analysis, we assume that $\gamma = \mu/\lambda$ is an integer and a factor of N. Let $\Pi^T(m)$ denote the threshold activation policy utilizing a threshold parameter of m, and let $\bar{U}^{\Pi^T}(m)$ denote the time-average utility achieved by this threshold activation policy.

We analyze the performance of the threshold activation policy for a particular choice of threshold and show that such a policy achieves near-optimal performance for all the system models described earlier. In particular, we show that the activation policy with a threshold of $N/\gamma$ is asymptotically optimal with respect to the bucket size K, i.e. the time average utility achieved approaches the maximum achievable utility as K → ∞ for all the sensor system models.

Theorem 11 The threshold activation policy employing a threshold of $N/\gamma$ achieves asymptotically optimal performance under all the sensor system models, i.e.

\[ \bar{U}^{\Pi^T}\!\left(\frac{N}{\gamma}\right) \to U\!\left(\frac{N}{\gamma}\right) \quad \text{as } K \to \infty. \]

Proof: We consider each of the sensor system models individually and prove the performance result for each model.

• IDR (Independent Discharge Recharge) Model: Figure 5.3(a) represents the IDR sensor system operating under the threshold policy $\Pi^T(m)$. Let us call this system S and denote its performance by $\bar{U}^{\Pi^T}_{\mathrm{IDR}}(m)$. Note that we are considering the threshold of $m = N/\gamma$. Consider any work conserving scheduling discipline W. Operating under the discipline W, a server in the system never idles as long as there is a queue with non-zero quanta which is not being served currently. Let $V^{S}(i, t)$ denote the probability that i out of m servers are busy at time t in system S. Then the number of active sensors in the system at time t equals i with probability $V^{S}(i, t)$. Let $V^{S}(i)$ denote the steady-state probability of i servers being busy in system S. Then the time average utility achieved in system S is given by,

\[ \bar{U}^{\Pi^T}_{\mathrm{IDR}}(m) = \sum_{i=1}^{m} V^{S}(i)\, U(i). \tag{5.12} \]

Let us constrain system S such that server j (1 ≤ j ≤ m) can only serve queues from the set {(j − 1)c + 1, . . . , jc}, where $c = N/m = \gamma$. We refer to this new system as S′. Note that unlike system S, in system S′ a server may remain idle even when there exists a non-empty queue which is not being served currently. Since, in system S, there is a non-zero probability of server j serving a queue other than those belonging to the set {(j − 1)c + 1, . . . , jc} at some time, we have,

\[ V^{S}(i) \ge V^{S'}(i), \quad \forall i \in \{1, \ldots, m\}. \tag{5.13} \]

The above result can also be deduced from known results about pooling of servers [12, 71]. It has been shown [12] that pooling a group of servers increases the expected throughput, when compared to an equivalent number of servers working separately. In particular, [71] shows that system efficiency increases when separate traffic systems are combined into a single system, for the case when servers have equal service rates.
Although the above results have been shown for infinite buffer sizes, they can be shown to hold for finite buffer sizes as well.

Figure 5.4: IDR Modification Models

Now, let us superimpose the arrival processes at these groups of c queues to form system S′′. Figure 5.4 depicts the systems S, S′, and S′′. Since the arrivals at the c queues are independent Poisson processes with rate λ, their superposition is also a Poisson process with rate cλ [28]. The system S′′ is composed of m independent M/M/1/K queues, each with an arrival rate cλ and service rate µ. Since the buffer space in system S′ is c times that in system S′′, where c = γ ≥ 1, the overall number of arrivals that get served in system S′ is at least as much as that in system S′′. Therefore, we have,

\[ V^{S'}(i) \ge V^{S''}(i), \quad \forall i \in \{1, \ldots, m\}. \tag{5.14} \]

In system S′′, since the m queues are independent of each other, and since cλ = µ, the steady-state probability of any of these m queues being empty is given by $x = \frac{1}{K+1}$. Therefore, the probability that i out of m queues are non-empty (which is the same as $V^{S''}(i)$) is given by $\binom{m}{i}(1-x)^{i}x^{m-i}$. The time average utility achieved in system S′′ is given by,

\[
\begin{aligned}
\sum_{i=1}^{m} V^{S''}(i)\,U(i) &= \sum_{i=1}^{m} \binom{m}{i}(1-x)^{i}x^{m-i}\,U(i) \\
&\ge \sum_{i=1}^{m} \binom{m}{i}(1-x)^{i}x^{m-i}\,\frac{i}{m}\,U(m) \\
&= \left[\sum_{j=0}^{m-1} \binom{m-1}{j}(1-x)^{j}x^{m-1-j}\right](1-x)\,U(m) \\
&= (1-x)\,U(m) = \frac{K}{K+1}\,U\!\left(\frac{N}{\gamma}\right).
\end{aligned}
\tag{5.15}
\]

The inequality above follows since $\frac{U(i)}{i} \ge \frac{U(m)}{m}$, ∀i ∈ {1, . . . , m}. Thus, the time average utility for the IDR model (system S) satisfies,

\[ \bar{U}^{\Pi^T}_{\mathrm{IDR}}(m) = \sum_{i=1}^{m} V^{S}(i)\,U(i) \ge \sum_{i=1}^{m} V^{S''}(i)\,U(i) \ge \frac{K}{K+1}\,U\!\left(\frac{N}{\gamma}\right). \tag{5.16} \]

• CDR (Correlated Discharge Recharge) Model: Figure 5.3(b) represents the CDR sensor system operating under the threshold policy $\Pi^T(m)$. The queueing system comprises $c = N/m$ queues, each having an arrival rate of λ, and one exponential server with service rate µ. Let us call this system C.
Under any work conserving scheduling discipline W, the server remains busy as long as there is at least one quantum in any of the c queues, providing for a utility of U(m) in the system as long as the server remains busy. Let the steady-state probability of the server being idle in system C be denoted as $x^{C}$. Then the time average utility of the system is given by

\[ \bar{U}^{\Pi^T}_{\mathrm{CDR}}(m) = (1 - x^{C})\,U(m). \tag{5.17} \]

Let us consider the case when $m = N/\gamma$ and c = γ. Let us replace all the c finite buffers (in system C) by one finite buffer of capacity K to form the system C′. Thus system C′ has a batch arrival process with rate λ and batch size c (denoted by $\lambda^{c}$ in Figure 5.5) and is represented as $M^{c}/M/1/K$. Since the buffer space in system C is c times that in system C′, where c = γ ≥ 1, the overall number of arrivals that get served in system C is at least as much as that in system C′. Therefore, we have,

\[ x^{C} \le x^{C'}. \tag{5.18} \]

Figure 5.5: CDR Modification Models

Now, let us reduce the batch size (in system C′) from c to 1, reduce the buffer capacity from K to $K/c$, and reduce the service rate from µ = cλ to µ = λ, to form the system C′′. Figure 5.5 depicts the systems C, C′, and C′′. Note that the arrival rates in both the systems C′ and C′′ are the same and equal to λ. In system C′′, the mean time to serve a quantum equals $1/\lambda$. Similarly, in system C′, the mean time to serve a batch of c quanta equals $1/\lambda$. In system C′′, an arrival when the buffer is full would lead to the quantum being dropped. However, system C′ would allow a part batch acceptance strategy [52], wherein a part of the batch would be enqueued when the available buffer space is less than c. The overall number of arrivals that get served in system C′ is at least c times as much as that served in system C′′. Therefore,

\[ x^{C'} \le x^{C''}. \tag{5.19} \]

The system C′′ is an $M/M/1/\frac{K}{c}$ system with λ = µ. Therefore,

\[ x^{C''} = \frac{c}{K+c} = \frac{\gamma}{K+\gamma}. \tag{5.20} \]

Thus, $x^{C} \le \frac{\gamma}{K+\gamma}$, and the performance of the CDR model satisfies,

\[ \bar{U}^{\Pi^T}_{\mathrm{CDR}}(m) = (1 - x^{C})\,U(m) \ge \frac{K}{K+\gamma}\,U\!\left(\frac{N}{\gamma}\right). \tag{5.21} \]

Discussion: Note that the server idle probability for the $M^{c}/M/1$ system equals $\pi_0 = 1 - \frac{\lambda c}{\mu}$ [28]. Thus $\pi_0 \to 0$ when c = γ. However, as noted in [52], for the $M^{c}/M/1/K$ system, the steady state probabilities may not be expressed in closed form. Intuitively, one would expect that the server idle probability for the $M^{c}/M/1/K$ system would be of the order $\sim \frac{1}{K+1}$ when c = γ, since $\pi_0 \to 0$ as K → ∞. Also, the P-K equation [28] in the $M^{c}/M/1/K$ system for state i such that c ≤ i < K is of the form $\pi_i(\lambda + \mu) = \pi_{i+1}\,\mu + \pi_{i-c}\,\lambda$. Empirically, the steady state values for all these probabilities equal $\frac{1}{K+1}$ for i ≫ c, which also satisfies the P-K equations above. In addition, from empirical results we find that $\pi_0 \le \frac{1}{K+1}$. However, our motivation here is not to solve the $M^{c}/M/1/K$ system exactly, but to show that the equivalent sensor system provides asymptotically optimal performance. We note that through a rigorous analysis it might be possible to show that $x^{C} \le \frac{1}{K+1}$, and hence $\bar{U}^{\Pi^T}_{\mathrm{CDR}}(m) \ge \frac{K}{K+1}\,U\!\left(\frac{N}{\gamma}\right)$; nevertheless, the bound of $\frac{K}{K+\gamma}$ suffices for the proof of our claim.

• IDCR (Independent Discharge Correlated Recharge) Model: The IDCR model is similar to the IDR model, except that the recharge processes at various sensor nodes are completely correlated. Performing modifications to the original system in a manner similar to that for the IDR model, we get system S′′ where each of the queues has a batch arrival process with rate λ and batch size c (instead of a Poisson arrival process with rate cλ as in Figure 5.4). From the analysis of the CDR model above, we know that the server idle probability in each of these $m\,(= N/\gamma)$ queueing systems satisfies $x \le \frac{\gamma}{K+\gamma}$. Let $Y_i$, i ∈ {1, . . . , m}, denote the steady-state probability that server i is busy. That is, $Y_i = 1 - x \ge \frac{K}{K+\gamma}$.
The expected number of servers that are busy is given by $E\!\left[\sum_{i=1}^{m} Y_i\right] = m(1 - x)$. (Note here that the $Y_i$'s are not independent.) The time average utility achieved in system S′′ is given by

\[
\begin{aligned}
\sum_{i=1}^{m} V^{S''}(i)\,U(i) &= \sum_{i=1}^{m} \Pr[i \text{ out of } m \text{ servers are busy}]\;U(i) \\
&\ge \sum_{i=1}^{m} \Pr[i \text{ out of } m \text{ servers are busy}]\;\frac{i}{m}\,U(m) \\
&= \frac{U(m)}{m} \times (\text{expected number of servers that are busy}) \\
&= (1-x)\,U(m) \ge \frac{K}{K+\gamma}\,U\!\left(\frac{N}{\gamma}\right).
\end{aligned}
\tag{5.22}
\]

The former inequality above follows since $\frac{U(i)}{i} \ge \frac{U(m)}{m}$, ∀i ∈ {1, . . . , m}. Thus, the time average utility for the IDCR model satisfies,

\[ \bar{U}^{\Pi^T}_{\mathrm{IDCR}}(m) \ge \sum_{i=1}^{m} V^{S''}(i)\,U(i) \ge \frac{K}{K+\gamma}\,U\!\left(\frac{N}{\gamma}\right). \tag{5.23} \]

• CDIR (Correlated Discharge Independent Recharge) Model: The CDIR model is similar to the IDR model, except that the discharge processes at various sensor nodes are completely correlated. That is, all the active sensors get discharged of an energy quantum at the same time. The time to discharge one quantum, however, is still exponentially distributed with mean $1/\mu$. Performing modifications to the original system in a manner similar to that for the IDR model, we get system S′′ where each of the queues has a Poisson arrival process with rate cλ as in Figure 5.4. The server idle probability at each of these m queues equals $x = \frac{1}{K+1}$, since c = γ. Using analysis similar to that of the IDCR model, we have,

\[ \bar{U}^{\Pi^T}_{\mathrm{CDIR}}(m) \ge \sum_{i=1}^{m} V^{S''}(i)\,U(i) \ge (1-x)\,U(m) = \frac{K}{K+1}\,U\!\left(\frac{N}{\gamma}\right). \tag{5.24} \]

Note that the performance ratios $\frac{K}{K+1}$ and $\frac{K}{K+\gamma}$ approach unity as K becomes large. Therefore the threshold activation policy with a threshold of $m^{*} = N/\gamma$ is asymptotically optimal with respect to K, for all the different system models.

It is also worth noting that the threshold of $m^{*} = N/\gamma$ is the energy balancing threshold for the sensor systems. With $m^{*}$ sensors active in the system at any time, the average discharge rate in the system equals $m^{*}\mu = \frac{N\mu}{\gamma} = N\lambda$, which is the average recharge rate in the system.
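As an informal check of the binomial argument in (5.15), the following sketch evaluates the time average utility of system S′′ directly from the binomial expression and compares it with the bound K/(K+1)·U(m). It assumes the utility function U(n) = 1 − (1 − p_d)^n used elsewhere in the chapter; the function names are hypothetical and chosen only for this illustration.

```python
from math import comb


def utility(n, p_d):
    """Utility of n active sensors, U(n) = 1 - (1 - p_d)^n (assumed form)."""
    return 1.0 - (1.0 - p_d) ** n


def s2_utility(m, K, p_d):
    """Time average utility of S'': m independent M/M/1/K queues, c*lambda = mu.

    Each queue is empty with probability x = 1/(K+1), so the number of busy
    servers is Binomial(m, 1 - x).
    """
    x = 1.0 / (K + 1)
    return sum(
        comb(m, i) * (1 - x) ** i * x ** (m - i) * utility(i, p_d)
        for i in range(1, m + 1)
    )


def lower_bound(m, K, p_d):
    """Right-hand side of (5.15): K/(K+1) * U(m)."""
    return K / (K + 1) * utility(m, p_d)


if __name__ == "__main__":
    for K in (10, 100):
        for m in (2, 4, 8):
            print(K, m, s2_utility(m, K, 0.1), lower_bound(m, K, 0.1))
```

For a concave utility with U(0) = 0, the first value printed on each row should be no smaller than the second, and the gap should shrink as K grows.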
5.5.2 Simulation Results

We study the performance of threshold activation policies under the various sensor system models for a range of system parameters. The utility function used is $U(n) = 1 - (1 - p_d)^{n}$. We choose the following sets of parameters: N = 16, K ∈ {10, 100}, γ ∈ {2, 4, 8}, and $p_d$ ∈ {0.1, 0.5}. The results obtained for other values of these parameters are similar in nature. For each parameter setting, we compare the time-average utility achieved in the system for different values of the threshold parameter (m). Figure 5.6 plots the time average utilities $\bar{U}^{\Pi^T}(m)$ for the four sensor system models against the threshold parameter m. The figures also show:

1. $U\!\left(\frac{N}{\gamma}\right)$, the upper bound on the maximum achievable time average utility.
2. $\frac{K}{K+1}\,U\!\left(\frac{N}{\gamma}\right)$, the lower bound on the achieved utility for the threshold policy with a threshold of $m^{*} = N/\gamma$.

Figures 5.6(a) - 5.6(b) depict performance for large bucket size and small detection probability $p_d$. Figures 5.6(c) - 5.6(d) show performance variation for large detection probability $p_d$. Figures 5.6(e) - 5.6(f) show performance for small bucket size K.
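For readers who want to reproduce the flavour of these experiments, the sketch below simulates the IDR model under a threshold policy as a continuous-time Markov chain. It is not the simulator used for Figure 5.6: the function name, the full initial buckets, the lowest-index-first activation order, and the seed handling are all assumptions made for this example.

```python
import random


def simulate_idr_threshold(N, K, lam, mu, m, p_d, horizon, seed=0):
    """Estimate the time average utility of the IDR model under threshold m.

    Assumptions: every sensor starts with a full bucket of K quanta, available
    sensors are activated in index order, and U(n) = 1 - (1 - p_d)^n.
    """
    rng = random.Random(seed)
    energy = [K] * N
    active = [False] * N

    def fill():
        # Activate available sensors until m are active; return active count.
        n = sum(active)
        for i in range(N):
            if n >= m:
                break
            if not active[i] and energy[i] > 0:
                active[i] = True
                n += 1
        return n

    t, acc = 0.0, 0.0
    n_active = fill()
    while t < horizon:
        # Recharge quanta arrive at all N sensors; only active ones discharge.
        rate = N * lam + n_active * mu
        dt = min(rng.expovariate(rate), horizon - t)
        acc += dt * (1.0 - (1.0 - p_d) ** n_active)
        t += dt
        if t >= horizon:
            break
        if rng.random() < N * lam / rate:
            i = rng.randrange(N)                 # recharge event
            energy[i] = min(K, energy[i] + 1)    # excess quantum is discarded
        else:
            i = rng.choice([j for j in range(N) if active[j]])  # discharge
            energy[i] -= 1
            if energy[i] == 0:
                active[i] = False
        n_active = fill()
    return acc / horizon
```

A call such as `simulate_idr_threshold(N=16, K=100, lam=1.0, mu=4.0, m=4, p_d=0.1, horizon=200.0)` corresponds to γ = 4 and the energy balancing threshold m = N/γ. Because the number of active sensors never exceeds m, the returned estimate cannot exceed U(m); how close it comes depends on K and on the horizon.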
Figure 5.6: Performance of Threshold Policies. At the threshold of $N/\gamma$, all sensor system models achieve asymptotically optimal performance w.r.t. K. Each panel plots Time Average Utility against Threshold (m) for the curves Upper Bound (UB), IDR, CDIR, IDCR, CDR, and UB*K/(K+1). Panels: (a) γ = 4, pd = 0.1, K = 100; (b) γ = 8, pd = 0.1, K = 100; (c) γ = 4, pd = 0.5, K = 100; (d) γ = 2, pd = 0.5, K = 100; (e) γ = 4, pd = 0.1, K = 10; (f) γ = 8, pd = 0.1, K = 10.

We observe that the performance of threshold policies under various sensor system models follows a trend similar in nature to that observed for AAP performance in Section 5.4.2. First, the performance of threshold policies under the IDR model is not too far from optimal for sufficiently large values of the threshold parameter, particularly when $N/\gamma \le m \le N$. Secondly, we observe that the performance of threshold policies degrades in the presence of correlation in recharge/discharge processes across the sensor nodes. The degradation is more apparent at higher values of thresholds, i.e. when $m > N/\gamma$. Third, from the simulation results, we observe empirically that at all values of the threshold parameter m, the IDR model results in the best performance, and the CDR model the worst.
Furthermore, at all values of m, performance under the IDCR model is worse than that under the CDIR model. This suggests that correlation across recharge processes has a greater performance impact than correlation across discharge processes. Most importantly, the figures show that the performance of the threshold policy is maximized at the threshold of $m^{*} = N/\gamma$ or close to it. Moreover, the time average utility at the threshold of $N/\gamma$ is very close to the upper bound on the achievable performance, $U\!\left(\frac{N}{\gamma}\right)$, particularly for large K. The time average utility at this threshold is also greater than the lower bound $\frac{K}{K+1}\,U\!\left(\frac{N}{\gamma}\right)$ for all the sensor system models, as expected from Theorem 11.

Finally, note that we have so far assumed in our analysis and simulations that N is a multiple of γ. If this is not the case, then thresholds of $\lfloor N/\gamma \rfloor$ or $\lceil N/\gamma \rceil$ can be used. Performance under these thresholds could be sub-optimal, but is expected to be close to the upper bound nevertheless. A more appropriate strategy in this case, however, would be to choose the threshold randomly between $\lfloor N/\gamma \rfloor$ and $\lceil N/\gamma \rceil$, so that the time average of the thresholds chosen equals $N/\gamma$.

5.6 Activation Policies in a General Network Scenario

In a realistic deployment scenario, nodes may be deployed at random, and would cover different areas in the physical space of interest. In other words, the coverage areas of two sensors may overlap only partially, or may not overlap at all. In this section, we extend our threshold activation policy to this very general scenario. This partial coverage overlap scenario is very difficult to model and analyze, even for the special class of threshold policies. In this section, therefore, we develop a solution heuristically, based on the insights obtained for the identical coverage case. We develop a threshold activation algorithm that can be implemented in a distributed manner in a general network.
We then show through simulations that our solution yields a performance trend that is similar to that observed in the identical coverage case. More specifically, for an appropriately chosen threshold, our threshold activation algorithm results in near-optimal performance even in this general network scenario.

5.6.1 Distributed Threshold Activation Algorithm

To motivate our distributed activation algorithm, let us assume that $m_i$ is the target threshold for sensor i. In other words, sensor i wants to maintain a utility of $U(m_i)$ per unit area per unit time in its coverage area. Specifically, if the coverage area of the sensor is denoted by $A_i$, then the sensor aims to derive a utility of $|A_i|\,U(m_i)$ per unit time. When the sensor is available (i.e., has non-zero energy), then at any decision instant, the sensor computes the current utility per unit time in its coverage area. If the current utility is less than the targeted utility, then the node activates itself; otherwise, the node remains in the available state until the next decision instant (refer to Algorithm 2).

A sensor can compute the current utility derived from its coverage area in the following manner. For a generic area element $A \in A_i$, let n(A, t) denote the number of active sensors covering A at time t. Then the utility per unit time in the coverage area of node i is calculated as

\[ \int_{A_i} U(n(A, t))\, dA. \tag{5.25} \]

Assume that node i can communicate with all nodes whose coverage areas overlap with its own coverage area. Then the sensor can periodically poll those neighbors to know their activation state. Assuming that sensor i knows the coverage patterns of those neighbors, it can compute the current utility by evaluating the expression in (5.25). Therefore, the proposed algorithm can be realized in a distributed setting based only on local information.
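The local decision described above can be illustrated with a discretized version of (5.25), in which the coverage area is a set of unit-area grid cells and the integral becomes a sum. This is a sketch under stated assumptions rather than Algorithm 2 itself: the grid representation, the function names, and the utility U(n) = 1 − (1 − p_d)^n are choices made for the example.

```python
def utility(n, p_d=0.1):
    """Assumed utility of n active sensors, U(n) = 1 - (1 - p_d)^n."""
    return 1.0 - (1.0 - p_d) ** n


def covered_cells(center, radius, width, height):
    """Unit-square cells of a width x height grid whose centre lies in the disc."""
    cx, cy = center
    return {
        (x, y)
        for x in range(width)
        for y in range(height)
        if (x + 0.5 - cx) ** 2 + (y + 0.5 - cy) ** 2 <= radius ** 2
    }


def current_utility(own_cells, active_neighbor_cells, p_d=0.1):
    """Discrete form of (5.25): sum of U(n(A, t)) over the sensor's own cells.

    active_neighbor_cells is a list with one cell set per active neighbor,
    as learned by polling the neighbors and knowing their coverage patterns.
    """
    total = 0.0
    for cell in own_cells:
        n = sum(cell in cells for cells in active_neighbor_cells)
        total += utility(n, p_d)
    return total


def should_activate(own_cells, active_neighbor_cells, m_i, p_d=0.1):
    """Activate if the current utility is below the target |A_i| * U(m_i)."""
    target = len(own_cells) * utility(m_i, p_d)
    return current_utility(own_cells, active_neighbor_cells, p_d) < target
```

An available sensor would call `should_activate` at each decision instant with the coverage sets of its currently active neighbors. When all sensors share the same coverage set, the rule reduces to comparing the number of active sensors with the threshold, as in the identical coverage case.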
In practice the decision interval needs to be chosen carefully to ensure that not too much energy is wasted in the available state by periodic wakeup and polling, while guaranteeing good performance. Note that the algorithm is motivated by the threshold activation policy discussed in the previous section, and in the case of identical sensor coverages, it reduces to a distributed implementation of the threshold policy described earlier. Also, the targeted threshold $m_i$ may be different for different sensors depending upon the density of deployment of sensor nodes in each sensor's coverage area. Next, we elaborate on the choice of threshold for a sensor node.

5.6.2 Choice of Threshold

The threshold $m_i$ for each sensor i depends on the local neighborhood of the sensor i, where the local neighborhood includes all other sensors whose coverage area overlaps with that of sensor i. For each generic area element $A \in A_i$, let N(A) denote the total number of sensors covering A. Then the sensor i would like to maintain a threshold of $\frac{N(A)}{\gamma}$ in this area element. In this case, the overall targeted utility per unit time in the sensor's coverage area $A_i$ is given by

\[ \int_{A_i} U\!\left(\frac{N(A)}{\gamma}\right) |A|\, dA. \tag{5.26} \]

In order to evaluate the network performance for a range of thresholds, we introduce a local threshold parameter. If sensor i employs a local threshold parameter of α, its targeted utility is given by

\[ \int_{A_i} U\!\left(\alpha\,\frac{N(A)}{\gamma}\right) |A|\, dA. \tag{5.27} \]

The value of α = 1 corresponds to the threshold of $m = N/\gamma$ in the identical coverage case. Similarly, a value of α ≥ γ corresponds to the aggressive threshold of m = N.

However, note that some of the invariants from the identical coverage case may not be satisfied in this general network scenario. For instance, even though all the sensors employ a local threshold of α, it is possible that more than $\alpha\,\frac{N(A)}{\gamma}$ sensors are active in some area element A at some time.
This is because each sensor takes into account all the area elements in its coverage area to compute the current utility while making activation decisions (refer to (5.25)). Depending upon the coverage patterns and the density of the sensor nodes in the network, it is possible that a few area elements in $A_i$ are covered by a sufficient number of active sensors, whereas many others have insufficient coverage. This would lead to the sensor activating itself in order to increase the utility derived from the poorly covered area elements, in effect increasing the number of active sensors covering some area elements beyond the targeted threshold. Similarly, it is also possible that an area element has fewer than $\alpha\,\frac{N(A)}{\gamma}$ active sensors covering it, while an available sensor which covers the area element decides not to activate itself. Even in the presence of such discrepancies, one would expect that all the sensors employing a threshold parameter of α close to 1 would lead to good network performance. We evaluate the performance achieved in the network for various values of α in Section 5.6.4.

Since the optimal policy is difficult to formulate and compute in this case, we will compare the performance of our algorithm with respect to an upper bound. Let $\mathcal{A}$ denote the entire area in the network, and N(A) denote the number of sensors that cover area element $A \in \mathcal{A}$. The following result can be proved using the same line of analysis as in the proof of Lemma 7:

Corollary 12 The optimal time-average utility for a general network of sensors is upper-bounded by

\[ \int_{\mathcal{A}} U\!\left(\frac{N(A)}{\gamma}\right) dA. \tag{5.28} \]

Next, we describe the various mechanisms developed to introduce spatial correlations in the discharge/recharge processes across the sensor nodes in the network.

5.6.3 Discharge and Recharge Event Models

We assume that the discharge and recharge processes at the sensors are triggered by the occurrence of discharge and recharge events respectively.
An event drops at an area element in the network and affects sensors which lie in the vicinity of the area element. For instance, if an interesting phenomenon occurs somewhere in the network, it needs to be detected by one or more sensors, and then communicated to the sink for processing. Thus the interesting phenomenon leads to some amount of discharge at the sensors which detect it. Similarly, a renewable recharge source like sunlight can be modeled using recharge events dropping at area elements in the network. We assume that the recharge and discharge events occur randomly at the area elements spanning the entire network. The rates of these events are configured so as to provide each sensor with a discharge rate to recharge rate ratio of γ. The amount of spatial correlation in the recharge and discharge processes at the different sensor nodes is modeled using the following event models:

1. Independent Discharge Recharge Event Model: Events occur randomly in the physical space of interest, and an active sensor node gets discharged by a quantum only when an event occurs within its coverage area. Events are assumed to occur according to a Poisson process, and are uniformly distributed in the area of interest. The recharge process is modeled similarly to the discharge process. Thus recharge events occur randomly in the space of interest, and a sensor gets recharged by one quantum if a recharge event occurs within its coverage area. Note that even though this model is named the IDR event model, the recharge/discharge processes at the sensors are not independent since the sensors have overlapping coverage areas. However, the degree of correlation in this model is smaller in comparison to the other models described below.

2. Block-Correlated Event Models: Here the network is divided into virtual blocks of equal sizes.
The discharge/recharge events occur according to a Poisson process, and are assumed to be uniformly distributed in the area of interest. However, a discharge (recharge) event occurring anywhere in the block affects all the sensors located in this block in a similar manner. This introduces spatial correlation across the discharge (recharge) processes of the sensors. The degree of spatial correlation depends on the size of the blocks: the larger the block size, the higher the degree of correlation.

3. Event Radius based Event Models: In this model too, the discharge/recharge events occur according to a Poisson process, and are uniformly distributed in the area of interest. However, the events only affect the sensors which lie within a certain distance of the area element where the event occurred. This distance is called the event radius, which, in general, is larger than the sensor's coverage radius. A larger event radius leads to more spatial correlation in the discharge/recharge processes, and vice versa. Note that this model is similar in nature to the block-correlated event model, except that the blocks here are dynamic (event dependent) and are circular in shape.

Based upon the above models, we can have any combination of correlation across discharge and/or recharge processes (i.e., IDR, IDCR, CDIR, and CDR). For instance, the IDCR model is based on an independent discharge event model and a block-correlated (or an event radius correlated) recharge event model.

5.6.4 Simulation Results

The performance of the distributed node activation algorithm described in Section 5.6.1 is evaluated using simulations for a wide range of system parameters, for different system models. While conducting these simulations, we make the following assumptions:

• The sensors do not lose any energy due to periodic polling in their local neighborhood.
• When a sensor dies (i.e., gets drained of its energy), all the sensors in its local neighborhood re-evaluate their activation decisions.
Note that these assumptions are not restrictive in nature, and relaxing these assumptions would only cause a slight degradation in the system performance. Moreover, the performance difference is expected to appear at all thresholds. Therefore, the performance trends we observe here, with respect to the peak threshold performance and with the various correlation models, would still remain valid.

In the representative simulation results presented here, the simulation setup and the parameters used are as follows. A total of N = 56 sensors, each having a circular coverage pattern of radius r = 10 units, are thrown uniformly at random in an area of size 50 × 50. Each square of unit area is a generic area element. With these parameters, the mean coverage of the network (N̄), defined as the average number of sensors covering any area element in the network, is observed to be 7.1. The utility function used is given by $U(n) = 1 - (1 - p_d)^{n}$. The system parameters for the simulations (unless explicitly stated) are γ = 4, K = 100, and $p_d$ = 0.1. For the block-correlated event model, the number of blocks is 4.

Figure 5.7 depicts the network performance in terms of time average utility achieved in the network for various values of the local threshold parameter α. The figures also show the upper bound on achievable performance (UB) and UB × $\frac{K}{K+1}$. Figure 5.7(a) plots the performance of the IDR event model together with the block-correlated event models. Clearly, we notice the same trend with respect to the performance of various sensor system models as seen in the identical coverage case. Also, the performance for all the system models peaks at a value of α close to 1, and the peak performance is very close to the upper bound. Note that the IDR event model also has some amount of spatial correlation, due to which it does not perform close to UB at higher thresholds, as was the case with the IDR model in the identical coverage case.
Figure 5.7(b) demonstrates the performance for various system models with block-correlated events and a small sensor energy bucket size of K = 10. We observe that the peak performance for all the system models is more than UB × $\frac{K}{K+1}$. Figure 5.7(c) plots the performance under similar parameter settings for $p_d$ = 0.5. Note that the utility curve in Figure 3.2 depicts an exponential increase in the utility for small values of n when $p_d$ = 0.5. Therefore, the network performance peaks at a value of α (∼ 0.6) slightly less than 1 for all sensor system models.

Figure 5.7: Network Performance with local thresholds. Threshold of α = 1 achieves near-optimal performance. Each panel plots Time Average Utility against Local Threshold Parameter (α), together with the Upper Bound (UB) and UB*K/(K+1). Panels: (a) Large bucket size K; (b) Small bucket size K; (c) High detection probability pd; (d) High recharge rate λ; (e) Different correlation models (IDR, CDR with event radius 12, CDR with 4 blocks); (f) Different block sizes (CDR with 16 blocks, 4 blocks, 1 block). Panels (a) to (d) show the IDR, CDIR, IDCR and CDR curves.

Note that the peak performance falls short of UB × $\frac{K}{K+1}$ in this case.
This is due to the fact that the distributed threshold based algorithm does not satisfy all the invariants of the threshold policy, as discussed in Section 5.6.2. Figure 5.7(d) plots the performance for various system models for γ = 2. Since the recharge rates are higher, the degradation in performance at higher thresholds (α) (in comparison with UB) is less than that in Figure 5.7(a).

Next, we study the network performance for both block-correlated and event radius based event models, for different parameter settings. Figure 5.7(e) plots the performance for the IDR event model, where each sensor has a smaller coverage radius of 7. The mean coverage of the network (N̄) is around 3.3, leading to a decrease in coverage redundancy at each area element. This leads to a small difference between the peak IDR performance and the UB. Also, the peak is achieved at a value of α < 1. The figure also plots the performance for the block-correlated CDR model with 4 blocks, and the event radius based CDR model with event radius = 12. The amount of correlation introduced with 4 blocks is more than that with an event radius of 12, thus causing the latter system model to perform better than the former at higher thresholds.

Figure 5.7(f) plots the performance of the block-correlated CDR models with 1, 4 and 16 blocks. As the number of blocks decreases, the size of each block increases, which leads to an increase in the degree of spatial correlation. The figure demonstrates that the performance of threshold policies degrades as the degree of spatial correlation increases. This performance drop is particularly significant at higher values of the threshold parameter α.

Note that in all cases, the peak performance is very close to the upper bound. Thus the distributed threshold based activation algorithm achieves near-optimal performance even in a general network scenario. We observe that spatial correlation worsens system performance, particularly at higher values of α.
In general, values of α close to 1 lead to the peak performance for all the sensor system models.

5.7 Summary

In this chapter, we have considered a system of partially-rechargeable sensors, and addressed the question of how sensors should be activated dynamically with the objective of maximizing a generalized global performance metric. For the case where the recharge rate is no less than the discharge rate of active sensor nodes, a simple aggressive activation policy achieves optimal performance. However, the performance of such policies degrades considerably as the ratio of the recharge rate to the discharge rate decreases. For the case where the sensors have identical coverages, the energy balancing threshold activation policy achieves asymptotically optimal performance w.r.t. the sensors' energy bucket size K, even in the presence of spatial correlation in the discharge/recharge processes across the sensor nodes. The optimal threshold policy can be implemented in a distributed manner in the general network scenario, where it is observed to achieve near-optimal performance as well.

CHAPTER 6
Rechargeable Sensor Activation under Temporally Correlated Events

In this chapter, we consider the node activation question for a single rechargeable sensor node operating in a scenario where the events of interest exhibit some degree of temporal correlation across their occurrences. The key optimization question in such systems is how the sensor should be activated in time so that the number of interesting events detected is maximized under the typical slow rate of recharge of the sensor. The recharge-discharge dynamics of the rechargeable sensor node, along with temporal correlations in the event occurrences, make the optimal sensor activation question very challenging. In addition, the tiny and low-cost nature of the sensor and its minimal processing capabilities create the need to develop simple, but efficient algorithms for its operations.
In this chapter, we characterize the structure of optimal rechargeable sensor activation policies under temporally correlated event occurrences, and develop simple, efficient, near-optimal activation policies, which are easily implementable in practice.

The chapter is organized as follows. Section 6.1 formulates the problem in terms of system observability and outlines the solution approaches. Section 6.2 considers the case with complete system observability, while Section 6.3 considers partial observability. Section 6.4 briefly discusses the case with no temporal correlations. We summarize our results in Section 6.5.

6.1 Problem Formulation

We model a rechargeable sensor as an energy bucket of size K quanta, as in Chapter 5. We assume a discrete time model, where in each time slot, a recharge event occurs with probability q and charges the sensor with a constant charge of c quanta. Any charge in excess of K is discarded. The sensor in active state expends a charge of $\delta_1$ quanta (operational cost) during each time slot it is active, irrespective of the occurrence of an event of interest during the time slot. An event of interest occurs randomly in a time slot and discharges the sensor (if active) by an additional charge of $\delta_2$ quanta (detection cost). We assume $\delta_1 \ge c$ and $\delta_2 \ge \delta_1$. Let $\beta = \frac{\delta_2}{\delta_1}$. A sensor is considered available for activation as long as it has sufficient energy ($\ge \delta_1 + \delta_2$) to provide coverage for at least one time slot.

[Figure 6.1: Energy discharge-recharge model of the sensor. Labels: quantum arrivals (sensor recharge) at rate qc; bucket size K; activation policy; sensor activated, with discharge $\delta_1 + \delta_2$ (On period) or $\delta_1$ (Off period); sensor not activated (no discharge).]

Figure 6.1 depicts the energy model for a typical rechargeable sensor, where the recharge rate depends on q and c, and the discharge rate for the active sensor depends on $\delta_1$, $\delta_2$ and on the state of the event occurrence process.
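The energy model described above amounts to a single update per time slot. The following Python fragment is a minimal sketch of that update, assuming integer quanta; the function name `energy_step` and its argument names are illustrative and are not part of the original formulation.

```python
import random


def energy_step(level, active, event, K, q, c, delta1, delta2, rng=random):
    """Return the sensor's energy level after one time slot.

    level  : current charge (quanta), 0 <= level <= K
    active : whether the sensor tries to be active in this slot
    event  : whether an event of interest occurs in this slot
    """
    # A sensor without enough charge for one full slot cannot be activated.
    if active and level < delta1 + delta2:
        active = False
    # A recharge quantum of size c arrives with probability q.
    gain = c if rng.random() < q else 0
    # An active sensor pays delta1, plus delta2 if it detects an event.
    loss = 0
    if active:
        loss = delta1 + (delta2 if event else 0)
    # Any charge in excess of the bucket size K is discarded.
    return min(level + gain - loss, K)
```

With q = 0 the update is deterministic, so for example an active sensor at level 5 that detects an event with $\delta_1 = \delta_2 = 1$ moves to level 3.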
The extent of temporal correlation in the events that the sensor is expected to detect is specified using correlation probabilities $p_c^{on}$ and $p_c^{off}$, such that $\frac{1}{2} < p_c^{on}, p_c^{off} < 1$. If an interesting event occurred during time slot t, then in the next time slot (t+1) a similar event occurs with probability $p_c^{on}$, while no event occurs with probability $1 - p_c^{on}$. Similarly, if no event occurred during the current time slot, no event occurs in the next time slot with probability $p_c^{off}$. The event occurrence process comprises an alternating sequence of periods where events occur (On period) and do not occur (Off period). In practice, the Off periods are expected to be significantly longer than the On periods, which implies $p_c^{off} \ge p_c^{on}$, an assumption that we make in our analysis.

[Figure 6.2: Event occurrence process ($p_c^{on} = p_c^{off} = 0.8$). Here 0 and 1 represent the Off and On periods respectively. Axes: Event Process against Time slots (t).]

The sensor operating under an activation policy Π takes activation decisions in each time slot depending upon (a) the current energy level of the sensor, and (b) the knowledge about the state of event occurrence in the system. Let $E_o(T)$ denote the total number of events that occur in the sensor's sensing region during time interval [0 . . . T]. Let $E_d(T)$ denote the total number of events detected during this interval by the sensor operating under sensor activation policy Π. The quality of coverage (or the event detection probability) in the region, denoted $\hat{U}(\Pi)$, is measured in terms of the time-average fraction of events detected, i.e.,

\[ \hat{U}(\Pi) = \lim_{T \to \infty} \frac{E_d(T)}{E_o(T)}. \tag{6.1} \]

In this chapter, we address the following important question: How should the rechargeable sensor node be activated so as to maximize the event detection probability $\hat{U}$? In other words, our goal is to choose the activation policy Π such that $\hat{U}(\Pi)$ is maximized.
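To make the event model and the performance measure (6.1) concrete, the following sketch generates a sample path of the two-state event process and computes the fraction of events detected over a finite horizon for a given activation sequence. The function names, the choice of an initial Off state and the seeding are assumptions made for illustration only.

```python
import random


def simulate_events(p_on, p_off, T, seed=0):
    """Sample T slots of the On/Off event process (1 = event, 0 = no event)."""
    rng = random.Random(seed)
    state, events = 0, []  # assumption: the process starts in an Off period
    for _ in range(T):
        stay = p_on if state == 1 else p_off  # probability of keeping the state
        if rng.random() >= stay:
            state = 1 - state
        events.append(state)
    return events


def detection_fraction(events, active):
    """Finite-horizon version of (6.1): detected events / occurred events."""
    occurred = sum(events)
    detected = sum(e for e, a in zip(events, active) if a)
    return detected / occurred if occurred else 0.0
```

For instance, `detection_fraction([1, 0, 1, 1], [1, 1, 0, 1])` counts two detected events out of three that occurred.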
6.1.1 On-Off Periods

Figure 6.2 depicts the typical behavior of the event process as a sequence of On and Off periods. Consider a time slot t such that an event occurred at time slot t − 1, but no event occurred in time slot t. Let N denote the random variable representing the number of time slots (including t) after which the event occurs again. Then, $\Pr(N = i) = (p_c^{off})^{i-1}(1 - p_c^{off})$, $\forall i \ge 1$. Therefore,

\[ E[N] = (1 - p_c^{off}) \sum_{i=1}^{\infty} i\,(p_c^{off})^{i-1} = \frac{1}{1 - p_c^{off}}. \tag{6.2} \]

Thus, the expected length of an Off period is given by $\frac{1}{1 - p_c^{off}}$. Similarly, the expected length of an On period is given by $\frac{1}{1 - p_c^{on}}$. Using Markov chain analysis, as T → ∞, the steady-state probability of event occurrence equals $\pi^{on} = \frac{1 - p_c^{off}}{2 - p_c^{on} - p_c^{off}}$ (and $\pi^{off} = 1 - \pi^{on}$). Note that $p_c^{off} \ge p_c^{on}$ implies $\pi^{off} \ge \pi^{on}$.

6.1.2 System Observability

We consider two different system observability scenarios, namely Completely Observable and Partially Observable. In the completely observable system, the sensor has perfect information about the event occurrence state in the system at all times. More specifically, the sensor is able to observe the system state even when it is inactive. In practice, once the sensor deactivates itself, it may not be able to observe the state of event occurrence in the system during the time slots in which it is inactive, leading to partial observability. Therefore, the sensor would be required to take activation decisions under imperfect information about the system state. The structure of the optimal policy with complete information is similar to that corresponding to incomplete information, and provides important insights that help us develop simple, near-optimal activation policies for partially observable systems.

Our approach is as follows. We formulate the sensor activation problem under complete observability as a Markov Decision Process [57] and solve for the optimal policy.
We derive a tight bound on achievable performance measured in terms of $\hat{U}(\Pi)$ (Lemma 13) and show that the optimal policy achieves this bound for large energy bucket size K (Theorem 14). We then provide a simple and efficient activation policy which achieves asymptotically optimal performance w.r.t. K (Corollary 15). Next, we formulate the sensor activation problem under partial observability as a Partially Observable Markov Decision Process (POMDP) [16]. The challenge here is to transform this problem into an equivalent, completely observable MDP with the same optimal reward and actions as the POMDP [27]. We provide a near-optimal solution to the optimality equations (Theorem 18), and characterize some properties of near-optimal activation policies (Lemma 19). We empirically find the structure of the optimal policy for the POMDP, and focus on developing simple and efficient near-optimal policies with performance guarantees (Theorem 21).

6.1.3 Activation Policies

We consider the following categories of sensor activation algorithms:

6.1.3.1 Aggressive Wakeup (AW) policy

Operating under the AW policy, the sensor switches itself on whenever possible. In other words, the sensor activates itself in a time slot if its current charge level is $\ge \delta_1 + \delta_2$. Otherwise the sensor moves to the dead state.

6.1.3.2 Correlation-dependent Wakeup (CW) policy

The CW policy satisfies the following criteria: (i) An active sensor with sufficient energy remains active if an event occurred during the previous time slot; (ii) An active sensor goes to sleep for an arbitrary duration of time if no event occurred during the previous time slot. At the end of the sleep duration, the sensor activates itself to poll the system state. If at any time the sensor decides to activate itself but does not have sufficient energy to detect an event, i.e., its charge level is $< \delta_1 + \delta_2$, then the sensor moves to the dead state.
(Note that the dead state is different from the sleep (or inactive) state, with the latter corresponding to deliberate deactivation.) The best CW policy corresponds to choosing the sleep duration appropriately, given the system parameters and the temporal correlation matrix.

6.2 Activation under Perfect State Information

Let the system state at time t be $X_t = (L_t, E_t)$, where $L_t \in \{0, \ldots, K\}$ represents the energy level of the sensor at (the end of) time t, and $E_t \in \{0, 1\}$ equals one if an event occurred at time t; zero otherwise. Each time slot is a decision epoch. The action taken at (the end of) time t is denoted by $u_t \in \{0, 1\}$, where $u_t = 0$ ($u_t = 1$) corresponds to sensor deactivation (activation) at time t + 1. Since the next state of the system depends only upon the current state and the action taken, the system constitutes a Markov Decision Process (MDP). The sensor gains a reward of one if $u_t = 1$ and $E_{t+1} = 1$. The reward function $r(X_t, u_t)$ (using (2.1.1) in [57]) is given by,

\[ r(X_t, u_t) = \begin{cases} p_c^{on} & \text{if } u_t = 1,\ L_t \ge \delta_1 + \delta_2, \text{ and } E_t = 1, \\ 1 - p_c^{off} & \text{if } u_t = 1,\ L_t \ge \delta_1 + \delta_2, \text{ and } E_t = 0, \\ 0 & \text{otherwise.} \end{cases} \tag{6.3} \]

Let $w_t$ ($v_t$) be the random variable denoting the amount of charge gained (lost) by the sensor in time slot t + 1. Then, $w_t = c$ w.p. q; 0 otherwise. And,

\[ v_t = \begin{cases} \delta_1 + \delta_2 & \text{w.p. } u_t \left[ E_t\, p_c^{on} + (1 - E_t)(1 - p_c^{off}) \right], \\ \delta_1 & \text{w.p. } u_t \left[ E_t (1 - p_c^{on}) + (1 - E_t)\, p_c^{off} \right], \\ 0 & \text{w.p. } (1 - u_t). \end{cases} \tag{6.4} \]

The system state at time t + 1 is given by $(L_{t+1}, E_{t+1})$, where $L_{t+1} = \min(L_t + w_t - v_t, K)$ and $E_{t+1} = 1$ w.p. $E_t\, p_c^{on} + (1 - E_t)(1 - p_c^{off})$; 0 otherwise.

6.2.1 Upper Bound on Achievable Performance

Let $T_1$ denote the number of time slots in which the sensor was active, operating under the stationary optimal activation policy $\Pi^{opt}$ over a period of T time slots. As T → ∞, the total number of events that occur during time [0 . . . T] satisfies $\frac{E_o(T)}{T} \to \pi^{on} = \frac{1 - p_c^{off}}{2 - p_c^{on} - p_c^{off}}$. Let $P_s$ denote the steady-state success probability of detecting an event under policy $\Pi^{opt}$. Precisely, $P_s$ is the probability that an event occurs in time t conditioned on the sensor being active at time t. The total number of events detected by the sensor satisfies $E_d(T) = T_1 P_s$. Since the event occurrence process is independent of sensor activation and depends only upon its state in the previous time slot, the success probability $P_s$ can be expressed as,

\begin{align*}
P_s = \Pr[E_{t+1} | u_t] &= \frac{1}{\Pr[u_t]} \Pr[E_{t+1}, u_t] = \frac{1}{\Pr[u_t]} \left[ \Pr[E_{t+1}, u_t | E_t] \Pr[E_t] + \Pr[E_{t+1}, u_t | E_t^c] \Pr[E_t^c] \right] \\
&= \frac{1}{\Pr[u_t]} \left[ \Pr[E_{t+1} | u_t, E_t] \Pr[u_t | E_t] \Pr[E_t] + \Pr[E_{t+1} | u_t, E_t^c] \Pr[u_t | E_t^c] \Pr[E_t^c] \right] \\
&= \frac{1}{\Pr[u_t]} \left[ p_c^{on} \Pr[E_t, u_t] + (1 - p_c^{off}) \Pr[E_t^c, u_t] \right] = p_c^{on} \Pr[E_t | u_t] + (1 - p_c^{off}) \Pr[E_t^c | u_t] \\
&= p_c^{on} \Pr[E_t | u_t] + (1 - p_c^{off}) \left[ 1 - \Pr[E_t | u_t] \right] \le p_c^{on} \Pr[E_t | u_t] + (1 - p_c^{on}) \left[ 1 - \Pr[E_t | u_t] \right] \\
&\le p_c^{on}. \tag{6.5}
\end{align*}

(Note here that $E_t$ represents the event $E_t = 1$, and $E_t^c$ represents the event $E_t = 0$; similarly for $u_t$.) The first inequality above holds since $p_c^{on} \le p_c^{off}$. The second inequality holds for all values of $\Pr[E_t | u_t]$, since the weighted average of $p_c^{on}$ and $1 - p_c^{on}$ cannot exceed $p_c^{on}$ for $p_c^{on} > \frac{1}{2}$.

Assuming some initial charge L(0), the expected charge level of the sensor at time T (assuming that the sensor did not lose any charge due to its energy bucket being full when a charge quantum arrived) is given by, $E[L(T)] = L(0) + T_1 (qc - \delta_1 - P_s \delta_2) + (T - T_1) qc$. Now, since the charge held in the sensor energy bucket is non-negative, we have $E[L(T)] \ge 0$. Simplifying, dividing by T, and taking the limit T → ∞, we have,

\[ \lim_{T \to \infty} \frac{T_1}{T} \le \frac{qc}{\delta_1 + P_s \delta_2}. \tag{6.6} \]

Note here that if any charge was lost due to the sensor energy bucket being full, the fraction on the r.h.s. in (6.6) would decrease. From (6.1), the performance of policy $\Pi^{opt}$ is given by,

\[ \hat{U}(\Pi^{opt}) = \lim_{T \to \infty} \frac{E_d(T)}{E_o(T)} = \frac{1}{\pi^{on}} \lim_{T \to \infty} \frac{T_1 P_s}{T} \le \frac{1}{\pi^{on}} \frac{P_s\, qc}{\delta_1 + P_s \delta_2}. \tag{6.7} \]

Let UB denote the upper bound (r.h.s.
in (6.7)) to the performance of $\Pi^{opt}$. We have,

\[ \frac{dUB}{dP_s} = \frac{1}{\pi^{on}} \frac{qc\, \delta_1}{(\delta_1 + P_s \delta_2)^2} > 0. \tag{6.8} \]

Thus UB is a non-decreasing function of $P_s$. Now, from (6.5), we have the following result.

Lemma 13 For any stationary activation policy $\Pi^{opt}$, $\hat{U}(\Pi^{opt}) \le \frac{1}{\pi^{on}} \frac{p_c^{on}\, qc}{\delta_1 + p_c^{on} \delta_2}$.

6.2.2 Optimal Policy

Since the induced Markov chain is unichain, from Theorem 8.4.5 in [57], there exists a deterministic, Markov, stationary ($\Pi^{MD}$) optimal policy which also leads to a limiting (steady-state) transition probability matrix. The performance criterion considered is the average expected reward criterion. The optimality equations are given by [7],

\[ \lambda^* + h^*(X) = \max_{u \in \{0, 1\}} \left[ r(X, u) + \sum_{X' = (0,0)}^{(K,1)} p_{XX'}(u)\, h^*(X') \right], \quad \forall X \in \{(0, 0), \ldots, (K, 1)\}, \tag{6.9} \]

where $p_{XX'}(u)$ denotes the probability of transition from state X to state X′ when action u is taken. The following properties can be shown to hold using the arguments in [7]: (a) $h^*((L, 1))$ is monotonically non-decreasing in L, (b) $h^*((L, 0))$ is monotonically non-decreasing in L, and (c) $h^*((L, 1)) > h^*((L, 0))$ $\forall L \in \{0, \ldots, K\}$.

Definition 1 Let h′ and λ′ satisfy (6.9) within an error of $\epsilon$. Then the policy Π corresponding to the actions u′ obtained using h′ and λ′ in (6.9) is an $\epsilon$-optimal policy.

We shall show that for K sufficiently large, and $\delta_1 \ge c$, the solution to the optimality equations within an error of $o\left(\frac{1}{K}\right)$ is given by,

\[ h^*((L, 1)) = \alpha L, \quad h^*((L, 0)) = \alpha L, \quad \text{and} \quad \lambda^* = \alpha qc, \tag{6.10} \]

where $\alpha = \frac{p_c^{on}}{\delta_1 + p_c^{on} \delta_2}$, and $o\left(\frac{1}{K}\right) \to 0$ as K → ∞. Thus the optimal average reward per stage $\lambda^* \to \frac{p_c^{on}\, qc}{\delta_1 + p_c^{on} \delta_2}$ as K → ∞. We divide the state-space into six categories, and show that the optimality equations hold for the above values of $h^*$, $\lambda^*$ in all the scenarios, within an error of $o\left(\frac{1}{K}\right)$.

Case I: (L, 0), $0 \le L < \delta_1 + \delta_2$: The l.h.s. of the optimality equation equals $h^*((L, 0)) + \lambda^*$. Only the deactivate action is feasible in this state. The r.h.s.
for deactivate action is given by,

\begin{align*}
\text{r.h.s.}(0) &= r((L, 0), 0) + p_c^{off} (1 - q)\, h^*((L, 0)) + p_c^{off} q\, h^*((L + c, 0)) \\
&\quad + (1 - p_c^{off})\, q\, h^*((L + c, 1)) + (1 - p_c^{off})(1 - q)\, h^*((L, 1)) \\
&= 0 + p_c^{off} (1 - q) \alpha L + p_c^{off} q \alpha (L + c) + (1 - p_c^{off})\, q \alpha (L + c) + (1 - p_c^{off})(1 - q) \alpha L \\
&= \alpha L + \alpha qc = h^*((L, 0)) + \lambda^*.
\end{align*}

Case II: (L, 1), $0 \le L < \delta_1 + \delta_2$: The deactivate action is the only feasible action and, similar to Case I, the optimality equation is satisfied within an error of $o\left(\frac{1}{K}\right)$.

Case III: (L, 0), $\delta_1 + \delta_2 \le L \le K - c$: We show that the optimality equation is satisfied for the deactivate action, and that the maximum on the r.h.s. in (6.9) is achieved for the deactivate action. The former follows similar to Case I. The r.h.s. for activate action is given by,

\begin{align*}
\text{r.h.s.}(1) &= r((L, 0), 1) + p_c^{off} (1 - q)\, h^*((L - \delta_1, 0)) + p_c^{off} q\, h^*((L + c - \delta_1, 0)) \\
&\quad + (1 - p_c^{off})\, q\, h^*((L + c - \delta_1 - \delta_2, 1)) + (1 - p_c^{off})(1 - q)\, h^*((L - \delta_1 - \delta_2, 1)) \\
&= 1 - p_c^{off} + p_c^{off} (1 - q) \alpha (L - \delta_1) + p_c^{off} q \alpha (L + c - \delta_1) \\
&\quad + (1 - p_c^{off})\, q \alpha (L + c - \delta_1 - \delta_2) + (1 - p_c^{off})(1 - q) \alpha (L - \delta_1 - \delta_2) \\
&= \alpha L + \alpha qc + 1 - p_c^{off} - p_c^{off} \alpha \delta_1 - (1 - p_c^{off}) \alpha (\delta_1 + \delta_2) \\
&= \alpha L + \alpha qc + 1 - p_c^{off} - \alpha \left( \delta_1 + (1 - p_c^{off}) \delta_2 \right) \\
&= \alpha L + \alpha qc + 1 - p_c^{off} - \frac{p_c^{on} \left( \delta_1 + (1 - p_c^{off}) \delta_2 \right)}{\delta_1 + p_c^{on} \delta_2} \\
&= \alpha L + \alpha qc + \frac{(1 - p_c^{off} - p_c^{on})\, \delta_1}{\delta_1 + p_c^{on} \delta_2} < \alpha L + \alpha qc = h^*((L, 0)) + \lambda^*.
\end{align*}

The inequality above follows since $p_c^{off} + p_c^{on} > 1$.

Case IV: (L, 1), $\delta_1 + \delta_2 \le L \le K - c$: We show that the optimality equation is satisfied for the activate action. The r.h.s.
in (6.9) for activate action is given by,

\begin{align*}
\text{r.h.s.}(1) &= r((L, 1), 1) + p_c^{on} (1 - q)\, h^*((L - \delta_1 - \delta_2, 1)) + p_c^{on} q\, h^*((L + c - \delta_1 - \delta_2, 1)) \\
&\quad + (1 - p_c^{on})\, q\, h^*((L + c - \delta_1, 0)) + (1 - p_c^{on})(1 - q)\, h^*((L - \delta_1, 0)) \\
&= p_c^{on} + p_c^{on} (1 - q) \alpha (L - \delta_1 - \delta_2) + p_c^{on} q \alpha (L + c - \delta_1 - \delta_2) \\
&\quad + (1 - p_c^{on})\, q \alpha (L + c - \delta_1) + (1 - p_c^{on})(1 - q) \alpha (L - \delta_1) \\
&= p_c^{on} + \alpha L + \alpha qc - p_c^{on} \alpha (\delta_1 + \delta_2) - (1 - p_c^{on}) \alpha \delta_1 \\
&= \alpha L + \alpha qc + p_c^{on} - \alpha (\delta_1 + p_c^{on} \delta_2) \\
&= \alpha L + \alpha qc = h^*((L, 1)) + \lambda^*.
\end{align*}

Table 6.1: $\epsilon$-Optimal actions for perfect state information, $\epsilon \sim o\left(\frac{1}{K}\right)$

         L < δ1 + δ2    δ1 + δ2 ≤ L ≤ K − c    K − c < L ≤ K
E = 0    0              0                      0/1 (depends)*
E = 1    0              0, 1†                  1

* Activation is ε-optimal if the recharge rate is sufficiently high.
† Both the activate and deactivate actions are ε-optimal in this case.

Note that the optimality equation is also satisfied for the deactivate action (using analysis similar to Case I), and hence both the actions are optimal in this case. An intuitive explanation of this behavior lies in the choice of optimal actions in other cases. We shall observe that activation is the optimal action in Case VI, when the energy level of the sensor is close to K and the event occurrence process is in the On period. In addition, the optimal action is to deactivate in Case III, when the event occurrence is Off and the sensor has energy < K − c. The above two facts imply that for sufficiently large K, the sensor does not lose long term rewards by deactivating in this case (even during the On period), as long as it does activate when its energy level is close to K.

Case V: (L, 0), $K - c < L \le K$: We evaluate the r.h.s. in (6.9) for both the activate and deactivate actions.

\begin{align*}
\text{r.h.s.}(0) &= r((L, 0), 0) + p_c^{off} (1 - q)\, h^*((L, 0)) + p_c^{off} q\, h^*((K, 0)) \\
&\quad + (1 - p_c^{off})\, q\, h^*((K, 1)) + (1 - p_c^{off})(1 - q)\, h^*((L, 1)) \\
&= 0 + p_c^{off} (1 - q) \alpha L + p_c^{off} q \alpha K + (1 - p_c^{off})\, q \alpha K + (1 - p_c^{off})(1 - q) \alpha L \\
&= \alpha L + \alpha q (K - L) = \alpha L + \alpha qc + \alpha q (K - L - c).
\end{align*}
Similar to Case III, we get, $\text{r.h.s.}(1) = \alpha L + \alpha qc + \frac{(1 - p_c^{on} - p_c^{off})\, \delta_1}{\delta_1 + p_c^{on} \delta_2}$. The optimality equation is given by,

\[ \lambda^* + h^*((L, 0)) = \alpha L + \alpha qc + \max \left\{ \alpha q (K - L - c),\ \frac{(1 - p_c^{on} - p_c^{off})\, \delta_1}{\delta_1 + p_c^{on} \delta_2} \right\}. \]

Substituting $\lambda^*$, $h^*$ from (6.10), and dividing both sides by K, we get,

\[ \frac{\alpha qc}{K} + \frac{\alpha L}{K} = \frac{\alpha L}{K} + \frac{\alpha qc}{K} + \max \left\{ \frac{\alpha q (K - L - c)}{K},\ \frac{(1 - p_c^{on} - p_c^{off})\, \delta_1}{K (\delta_1 + p_c^{on} \delta_2)} \right\}. \]

Since $K - c < L \le K$, as K becomes large, $\frac{\alpha L}{K} \to \alpha$ and all other terms → 0. The optimality equation is satisfied within an error of $o\left(\frac{1}{K}\right)$. However, the optimal action in this state depends on the values of the two negative terms above. For instance, when L = K, the activate action is optimal if $qc > \frac{\delta_1 (p_c^{on} + p_c^{off} - 1)}{p_c^{on}}$; otherwise deactivation is optimal. In other words, activation is optimal if the recharge rate is sufficiently large.

Case VI: (L, 1), $K - c < L \le K$: Since $\delta_1 \ge c$, using analysis similar to Case IV, it follows that the optimality equation is satisfied for the activate action. We shall show that the activate action achieves the maximum on the r.h.s. of (6.9). The r.h.s. for the deactivate action is given by,

\begin{align*}
\text{r.h.s.}(0) &= r((L, 1), 0) + p_c^{on} (1 - q)\, h^*((L, 1)) + p_c^{on} q\, h^*((K, 1)) + (1 - p_c^{on})\, q\, h^*((K, 0)) \\
&\quad + (1 - p_c^{on})(1 - q)\, h^*((L, 0)) \\
&= 0 + p_c^{on} (1 - q) \alpha L + p_c^{on} q \alpha K + (1 - p_c^{on})\, q \alpha K + (1 - p_c^{on})(1 - q) \alpha L \\
&= \alpha L + \alpha q (K - L) < \alpha L + \alpha qc = h^*((L, 1)) + \lambda^*.
\end{align*}

The inequality above follows since K − L < c. From the cases considered above, we obtain the following result.

Theorem 14 The values of $\lambda^*$ and $h^*$ given by $h^*((L, 0)) = h^*((L, 1)) = \alpha L$, $\lambda^* = \alpha qc$, where $\alpha = \frac{p_c^{on}}{\delta_1 + p_c^{on} \delta_2}$, satisfy the optimality equations for the MDP within an error of $\epsilon \sim o\left(\frac{1}{K}\right)$.

The $\epsilon$-optimal actions are depicted in Table 6.1. As K → ∞, the average reward per time slot approaches $\lambda^*$, and the fraction of events detected by the sensor is given by $\lim_{T \to \infty} \frac{\lambda^* T}{\pi^{on} T} = \frac{1}{\pi^{on}} \frac{p_c^{on}\, qc}{\delta_1 + p_c^{on} \delta_2}$, which is also the maximum achievable performance from Lemma 13.
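For a finite bucket size K, the optimality equations (6.9) can also be solved numerically. The following is a minimal sketch of relative value iteration for this MDP, written under several assumptions of ours: integer quanta, a fixed number of iterations instead of a convergence test, and (K, 1) as the reference state. The function names are illustrative. The sketch also evaluates the asymptotic value $\lambda^* = \alpha qc$ of (6.10) for comparison.

```python
def asymptotic_gain(q, c, d1, d2, p_on):
    """lambda* = alpha*q*c with alpha = p_on / (d1 + p_on*d2), as in (6.10)."""
    return p_on * q * c / (d1 + p_on * d2)


def relative_value_iteration(K, q, c, d1, d2, p_on, p_off, iters=2000):
    """Approximate the average reward (gain) and a greedy policy for (6.9)."""
    states = [(L, E) for L in range(K + 1) for E in (0, 1)]
    h = {s: 0.0 for s in states}
    ref = (K, 1)  # reference state; an arbitrary choice
    gain, policy = 0.0, {}
    for _ in range(iters):
        backup = {}
        for (L, E) in states:
            p_event = p_on if E == 1 else 1.0 - p_off  # Pr[E_{t+1} = 1 | E_t]
            best = None
            for u in (0, 1):
                if u == 1 and L < d1 + d2:
                    continue  # activation needs at least d1 + d2 quanta
                value = u * p_event  # expected one-step reward, as in (6.3)
                for e_next, pe in ((1, p_event), (0, 1.0 - p_event)):
                    for w, pw in ((c, q), (0, 1.0 - q)):
                        spent = u * (d1 + d2 * e_next)  # charge lost, as in (6.4)
                        L_next = min(L + w - spent, K)
                        value += pe * pw * h[(L_next, e_next)]
                if best is None or value > best[0]:
                    best = (value, u)
            backup[(L, E)] = best[0]
            policy[(L, E)] = best[1]
        gain = backup[ref]  # current estimate of the average reward
        h = {s: v - gain for s, v in backup.items()}
    return gain, policy
```

For any finite K, the gain returned by this sketch should not exceed $\alpha qc$, in line with the bound of Section 6.2.1. Ties between the two actions are broken in favor of deactivation here, which is one of several equally valid choices where Table 6.1 lists both actions.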
Table 6.2: Optimal actions for sample cases

Low recharge (q = 0.1)               High recharge (q = 0.5)
Activate          Deactivate         Activate           Deactivate
(L, 1): L ≥ 2     (L, 0)             (L, 1): L ≥ 2      (L, 0): L < 10
                  (L, 1): L < 2      (L, 0): L = 10     (L, 1): L < 2

6.2.3 Optimal Policy evaluation using Value Iteration

Theorem 14 provides us with the class of activation policies which are $\epsilon$-optimal for all values of system parameters, where $\epsilon \sim o\left(\frac{1}{K}\right)$. We now focus on the unique optimal policy for a given set of system parameters. Relative value iteration [7] is used to solve the optimality equations in (6.9). For system parameters $\delta_1 = \delta_2 = c = 1$, $p_c^{on} = 0.6$, $p_c^{off} = 0.8$, q = 0.1, K = 10, the optimal policy is depicted in Table 6.2. We observe that the sensor operating under the optimal policy always tries to activate itself during the On period, and deactivates itself during the Off period. An increase in the recharge rate causes a slight change in the structure of the optimal policy, particularly when the sensor has a charge level close to K, as depicted in Table 6.2. In effect, if the recharge rate is high, the sensor tries to activate itself during the Off period as well, as long as its energy level is close to K.

6.2.4 Activation Algorithm

We observe that the optimal policy is in general sensitive to system parameters. Also, it requires the sensor to keep track of its current energy level at all times. However, the sensor may not always have an accurate estimate of the system parameters and/or may not be able to continuously keep track of its current energy level due to the computational overhead or other practical considerations. Therefore, there is a need to develop simpler but efficient algorithms for sensor activation. We now show that for large K, near-optimal solutions can be obtained in a simple, closed form.
The class of $\epsilon$-optimal policies given by Theorem 14 provides us with an activation algorithm which is deterministic, Markov (memoryless), does not require the sensor to keep track of its current energy level while making activation decisions, and results in close-to-optimal performance. The sensor is only required to know if it has sufficient energy for activation, i.e., whether its energy level is $\ge \delta_1 + \delta_2$. The following simple algorithm achieves asymptotically optimal performance with respect to the sensor energy bucket size K.

Corollary 15 The sensor activation policy $\Pi^*$ with actions $u^*$ defined by

\[ u^* = \begin{cases} 1 & \text{if } L \ge \delta_1 + \delta_2 \text{ and } E = 1, \\ 0 & \text{otherwise,} \end{cases} \]

is an $\epsilon$-optimal policy where $\epsilon \sim o\left(\frac{1}{K}\right)$.

In words, the decision $u^*$ is to activate the sensor in a time slot iff an event occurred in the previous time slot, and the sensor has sufficient energy to operate in the current time slot.

6.3 Activation under Imperfect State Information

For partially observable systems, we formulate the sensor activation problem as a POMDP [16], transform it to an equivalent MDP using well-known techniques [27, 7], and elaborate on the structure of the optimal policy in Section 6.3.1. We then develop an efficient and easily implementable activation policy in Section 6.3.2, and show that it achieves near-optimal performance when K and β are sufficiently large. We discuss the performance of AWP policies in Section 6.3.3, and present simulation results in Section 6.3.4.

6.3.1 Structure of Optimal Policy

We present the equivalent MDP formulation of the POMDP in Section 6.3.1.1. We formulate an approximate solution to the optimality equations and derive useful properties of $\epsilon$-optimal policies in Section 6.3.1.2. We then numerically compute the optimal policy for some sample scenarios using relative value iteration, and discuss the observed structure of the optimal policy in Section 6.3.1.3.
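Before turning to the partially observable formulation, the policy of Corollary 15 for the completely observable case of Section 6.2 can be checked by simulation against the bound of Lemma 13. The following is a minimal Monte Carlo sketch under assumptions of ours: integer quanta, a full bucket and an Off state at time zero, and a finite horizon T. The function names and the seed are illustrative.

```python
import random


def lemma13_bound(q, c, d1, d2, p_on, p_off):
    """Upper bound of Lemma 13 on the fraction of events detected."""
    pi_on = (1.0 - p_off) / (2.0 - p_on - p_off)
    return (p_on * q * c / (d1 + p_on * d2)) / pi_on


def simulate_corollary15(T, K, q, c, d1, d2, p_on, p_off, seed=1):
    """Fraction of events detected when activating iff E = 1 and L >= d1 + d2."""
    rng = random.Random(seed)
    L, E = K, 0  # assumption: start with a full bucket in an Off period
    occurred = detected = 0
    for _ in range(T):
        # Decision uses the event state of the previous slot (fully observed).
        u = 1 if (E == 1 and L >= d1 + d2) else 0
        p_event = p_on if E == 1 else 1.0 - p_off
        E = 1 if rng.random() < p_event else 0
        w = c if rng.random() < q else 0
        # Energy update L <- min(L + w - v, K), with v as in (6.4).
        L = min(L + w - u * (d1 + d2 * E), K)
        occurred += E
        detected += u * E
    return detected / occurred if occurred else 0.0
```

With the parameters of Section 6.2.3 ($\delta_1 = \delta_2 = c = 1$, $p_c^{on} = 0.6$, $p_c^{off} = 0.8$, q = 0.1) the bound evaluates to 0.1125, and a long simulation run with a moderately large K can be compared against that value.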
6.3.1.1 MDP Formulation

The state $X_t$ of the sensor system at time t is represented as $(L_t, E_t)$, as in Section 6.2. The observation $Y_t$ made at time t depends on the state at time t and the action taken at time t − 1. If the action $u_{t-1}$ was to activate, the observation matches the state and equals $(L_t, E_t)$. However, if the action taken was to deactivate, the state of the event process at time t is not known; in this case the observation equals $(L_t, \phi)$, where $\phi$ represents that $E_t$ is unknown. Note that the current energy level $L_t$ is assumed to be always observable. The state-space $\mathcal{X}$ is given by $\mathcal{X} = \{(0, 0), (1, 0), \ldots, (K, 0), (0, 1), (1, 1), \ldots, (K, 1)\}$, where $|\mathcal{X}| = 2(K + 1)$. The observation space $\mathcal{Y}$ is given by $\mathcal{Y} = \mathcal{X} \cup \{(0, \phi), (1, \phi), \ldots, (K, \phi)\}$, with $|\mathcal{Y}| = 3(K + 1)$. The set of actions is $U = \{0, 1\}$. Let $q_{x,y}(u)$ denote the probability distribution of the observation ($Y_{t+1} = y$) at time t + 1, conditioned on the current state ($X_{t+1} = x$) and the action taken at time t ($U_t = u$) [27]. Thus, $q_{x,y}(u) = \Pr[Y_{t+1} = y | X_{t+1} = x, U_t = u]$. In particular, $q_{x,y}(1) = 1$ if y = x; 0 otherwise, and $q_{x,y}(0) = 1$ if $y = (x.L, \phi)$; 0 otherwise, where $x.L = L_t$ when $x = (L_t, E_t)$.

Since the system state is not completely observable, the optimal action depends on the current and past observations, and on past actions. It has been shown that the POMDP can be formulated as a completely observable MDP [27, 7], with the same finite action set. The solution to the equivalent MDP with complete information provides us with the optimal actions to take (in the POMDP) and with the optimal reward. The state space of the equivalent MDP, denoted Δ, comprises the space of probability distributions on the original state space, which may lead to a possibly uncountable or infinite state space. However, the structure of the original POMDP, in most cases, allows for the existence of solutions to the average cost (reward) optimality equation [27].
In the case of sensor activation under partial observations too, the structure of the POMDP leads to a countable state space for the equivalent MDP, guaranteeing the existence of an optimal solution.

The state of the equivalent MDP at time t is the information vector $Z_t \in \Delta$ (of length $|\mathcal{X}|$), whose $i$-th component is given by $Z_t^{(i)} = \Pr[X_t = i | y_t, \ldots, y_1; u_{t-1}, \ldots, u_0]$, $i \in \mathcal{X}$. We have $\mathbf{1}' Z_t = 1$, since the elements of $Z_t$ correspond to mutually exclusive events whose union is the universal set. The state $Z_{t+1}$ is recursively computable given the transition probability matrices P(u), the action taken $u_t$ and the observation $y_{t+1}$ [27], as

\[ Z_{t+1} = \sum_{y \in \mathcal{Y}} \frac{\bar{Q}_y(u_t)\, P'(u_t)\, Z_t}{\mathbf{1}' \bar{Q}_y(u_t)\, P'(u_t)\, Z_t}\, I[Y_{t+1} = y], \tag{6.11} \]

where I[A] denotes the indicator function of the event A and the matrices $\bar{Q}_y(u) = \mathrm{diag}\{q_{x,y}(u)\}$. $\mathbf{1}'$ denotes a row vector with all elements equal to one. The numerator in the recursive relation denotes the probability of the event $X_{t+1} = i$, $Y_{t+1} = y$ given past actions and observations and is denoted by $\bar{T}(y, Z_t, u_t)$, while the denominator denotes the probability of the event $Y_{t+1} = y$ given past actions and observations and is denoted by $V(y, Z_t, u_t)$. The fraction $\left(\frac{\bar{T}}{V}\right)$ is denoted $W(y, Z_t, u_t)$. $\{Z_t\}$ forms a completely observable controlled Markov process with state space Δ. The reward associated with the state $Z \in \Delta$ and action $u \in U$ is defined as $\bar{r}(Z, u) = Z' [r(i, u)]_{i \in \mathcal{X}}$. The optimal reward for the original POMDP is the same as that of the equivalent formulated MDP [27].

Let $e_j$ denote the unit column vector with all elements equal to zero except the $j$-th element, which equals one. Thus, if $u_{t-1} = 1$, $Z_t = e_{y_t} = e_{x_t}$. On the other hand, if $u_{t-1} = 0$, then the observation $y_t = (L, \phi)$, for some L such that $0 \le L \le K$. Given the observation ($y_t = (L, \phi)$), the state of the system is either (L, 0) or (L, 1).
Thus the state $Z_t$ of the equivalent MDP has a maximum of two non-zero components, and is of the form $Z_t = \alpha_1 e_j + \alpha_2 e_{j'}$, where $\alpha_1 + \alpha_2 = 1$, $0 \le \alpha_1, \alpha_2 \le 1$, j = (L, 0), and j′ = (L, 1). The values of $\alpha_1$, $\alpha_2$ exhibit an elegant structure, as discussed below.

Let $F_{1,0}^{(i)}$ denote the probability that the event process at time t + i is off, given that it was on at time t. In other words, $F_{1,0}^{(i)}$ denotes the i-step transition probability of the event process changing states from 1 to 0 in i time slots. Similarly, $F_{0,1}^{(i)}$ denotes the probability that the event process changed from off to on during the i time slots. We have $F_{0,1}^{(0)} = F_{1,0}^{(0)} = 0$. The functions F can be expressed in closed form as

\[ \begin{bmatrix} 1 - F_{1,0}^{(i)} & F_{1,0}^{(i)} \\ F_{0,1}^{(i)} & 1 - F_{0,1}^{(i)} \end{bmatrix} = \begin{bmatrix} p_c^{on} & 1 - p_c^{on} \\ 1 - p_c^{off} & p_c^{off} \end{bmatrix}^i \quad \forall i > 0. \tag{6.12} \]

The following recursive equations can be shown to hold using (6.12):

\[ F_{0,1}^{(i+1)} = \left[ 1 - F_{0,1}^{(i)} \right] (1 - p_c^{off}) + F_{0,1}^{(i)}\, p_c^{on}, \quad \text{and} \quad F_{1,0}^{(i+1)} = \left[ 1 - F_{1,0}^{(i)} \right] (1 - p_c^{on}) + F_{1,0}^{(i)}\, p_c^{off}. \tag{6.13} \]

Since $0 < p_c^{on} + p_c^{off} - 1 < 1$, it can be shown that [9] $\forall i > 0$:

\[ F_{1,0}^{(i)} = \frac{(1 - p_c^{on}) \left[ 1 - (p_c^{on} + p_c^{off} - 1)^i \right]}{2 - p_c^{on} - p_c^{off}}, \quad \text{and} \quad F_{0,1}^{(i)} = \frac{(1 - p_c^{off}) \left[ 1 - (p_c^{on} + p_c^{off} - 1)^i \right]}{2 - p_c^{on} - p_c^{off}}. \tag{6.14} \]

Therefore, $\lim_{i \to \infty} F_{1,0}^{(i)} = \pi^{off}$ and $\lim_{i \to \infty} F_{0,1}^{(i)} = \pi^{on}$. Thus, $F_{1,0}^{(i)}$ and $F_{0,1}^{(i)}$ are monotonically increasing sequences converging to $\pi^{off}$ and $\pi^{on}$ respectively. Note that for the special case when $p_c^{on} = p_c^{off} = p_c$, the function F can also be expressed as $F_{0,1}^{(i)} = F_{1,0}^{(i)} = \sum_{j = 1,\ j\ \mathrm{odd}}^{i} \binom{i}{j} (1 - p_c)^j\, p_c^{(i - j)}$. Let us represent $Z_t = (1 - F_{E,1-E}^{(i)})\, e_{(L,E)} + F_{E,1-E}^{(i)}\, e_{(L,1-E)}$ as $Z_t = (L, E, i)$.

Lemma 16 The state-space Δ is countable.

Proof: Let $Z_t = (L', E', i)$ for some $0 \le L' \le K$, $E' \in \{0, 1\}$ and integer $i \ge 0$.

• Case $u_t = 1$: Let $X_{t+1} = (L, E)$. Then $y_{t+1} = X_{t+1} = (L, E)$. We have $Z_{t+1} = (L, E, 0)$.

• Case $u_t = 0$: Let $X_{t+1} = (L, E)$. Then $y_{t+1} = (L, \phi)$. Let us consider the case where E′ = 0.
Expanding we have, $Z_t = (1 - F_{0,1}^{(i)})\, e_{(L',0)} + F_{0,1}^{(i)}\, e_{(L',1)}$. Using (6.13), we have,

\begin{align*}
Z_{t+1} &= \left[ p_c^{off} (1 - F_{0,1}^{(i)}) + (1 - p_c^{on})\, F_{0,1}^{(i)} \right] e_{(L,0)} + \left[ p_c^{on} F_{0,1}^{(i)} + (1 - p_c^{off}) (1 - F_{0,1}^{(i)}) \right] e_{(L,1)} \\
&= \left[ 1 - F_{0,1}^{(i+1)} \right] e_{(L,0)} + F_{0,1}^{(i+1)}\, e_{(L,1)} = (L, 0, i + 1) = (L, E', i + 1).
\end{align*}

Similarly, for the case E′ = 1, $Z_{t+1} = (L, E', i + 1)$.

Thus, $Z_{t+1}$ is completely described using $Z_t$, $u_t$ and $y_{t+1}$. Assuming an initial state (and observation) of (K, 1), $Z_0 = (K, 1, 0)$ and $Z_t$ is of the form (L, E, i), $\forall t > 0$. Since L, E, and i are individually countable, and since all the vectors $Z \in \Delta$ are of the form (L, E, i), we have the result.

Thus $\alpha_1$, $\alpha_2$ take on values only from the set S, where

\[ S = \left\{ F_{E,1-E}^{(i)},\ 1 - F_{E,1-E}^{(i)} \right\}, \quad i \ge 0,\ i \text{ integer},\ E \in \{0, 1\}. \tag{6.15} \]

Lemma 17 The reward function of the equivalent MDP $\bar{r}(Z, u)$, $\forall Z \in \Delta$, $u \in U$ belongs to the set S, given by (6.15).

Proof: Let Z = (L, E, i). For u = 0, $\bar{r}(Z, 0) = 0 = F_{1,0}^{(0)}$. For u = 1 and $L < \delta_1 + \delta_2$, $\bar{r}(Z, 1) = 0 = F_{1,0}^{(0)}$. For u = 1 and $L \ge \delta_1 + \delta_2$, we have the following cases:

• E = 0: $\bar{r}(Z, 1) = [1 - F_{0,1}^{(i)}] (1 - p_c^{off}) + F_{0,1}^{(i)}\, p_c^{on} = F_{0,1}^{(i+1)}$.

• E = 1: $\bar{r}(Z, 1) = [1 - F_{1,0}^{(i)}]\, p_c^{on} + F_{1,0}^{(i)} (1 - p_c^{off}) = 1 - F_{1,0}^{(i+1)}$.

The above equalities follow from the definition of $\bar{r}$ and (6.13).

In reality, the state Z = (L, E, i) represents the following: (a) The sensor has been inactive for the last i time slots, (b) The state of the event process observed when the sensor was last active is E, and (c) The current charge level of the sensor equals L. The POMDP is transformed to an equivalent completely observable MDP with state-space Δ. The optimality equations for this MDP are given by [27]:

\[ \Gamma^* + h^*(Z) = \max_{u \in U} \left[ \bar{r}(Z, u) + \sum_{y \in \mathcal{Y}} V(y, Z, u)\, h^*(W(y, Z, u)) \right], \quad \forall Z \in \Delta. \tag{6.16} \]

The following properties can be shown to hold using the arguments in [7]: (a) $h^*((L, 1, i))$ is monotonically non-decreasing in L, (b) $h^*((L, 0, i))$ is monotonically non-decreasing in L, and (c) $h^*((L, 1, i)) > h^*((L, 0, i))$ $\forall L, i$.

6.3.1.2 Properties of $\epsilon$-optimal policies

We show that the solution to the optimality equations within an error of $o\left(\frac{1}{\beta}\right)$ is given by,

\[ h^*((L, 1, i)) = \alpha L, \quad h^*((L, 0, i)) = \alpha L, \quad \text{and} \quad \Gamma^* = \alpha qc, \quad \forall (L, E, i) \in \Delta, \tag{6.17} \]

where $\alpha = \frac{1}{\delta_2 + \delta_1 \left( 1 + \frac{1 - p_c^{on}}{\pi^{on}} \right)}$ and $\epsilon \sim o\left(\frac{1}{\beta}\right)$ is a small error fraction difficult to characterize analytically (recall $\beta = \frac{\delta_2}{\delta_1}$). Thus the optimal average reward per stage within an error of $o\left(\frac{1}{\beta}\right)$ is given by $\Gamma^* \le \frac{qc}{\delta_2 + \delta_1 \left( 1 + \frac{1 - p_c^{on}}{\pi^{on}} \right)}$. As T → ∞, $\frac{E_o(T)}{T} \to \pi^{on}$ and the fraction of events detected by the sensor is bounded as,

\[ \lim_{T \to \infty} \frac{\Gamma^* T}{\pi^{on} T} = \frac{\Gamma^*}{\pi^{on}} \le \frac{qc}{\pi^{on} (\delta_2 + \delta_1) + \delta_1 (1 - p_c^{on})}. \tag{6.18} \]

We divide the state-space into six categories as in Section 6.2.2, and show that the optimality equations hold for the above values of $h^*$, $\Gamma^*$ in all the scenarios within an error of $o\left(\frac{1}{\beta}\right)$.

Case I: (L, 0, i), $0 \le L < \delta_1 + \delta_2$, $i \ge 0$: The l.h.s. of the optimality equation equals $h^*((L, 0, i)) + \Gamma^*$. Only the deactivate action is feasible in this state.

\begin{align*}
\text{r.h.s.}(0) &= \bar{r}((L, 0, i), 0) + (1 - q)\, h^*((L, 0, i + 1)) + q\, h^*((L + c, 0, i + 1)) \\
&= 0 + (1 - q) \alpha L + q \alpha (L + c) = \alpha L + \alpha qc = h^*((L, 0, i)) + \Gamma^*.
\end{align*}

Case II: (L, 1, i), $0 \le L < \delta_1 + \delta_2$, $i \ge 0$: The deactivate action is the only feasible action and, similar to Case I, the optimality equation is satisfied within an error of $o\left(\frac{1}{\beta}\right)$.

Case III: (L, 0, i), $\delta_1 + \delta_2 \le L \le K - c$, $i \ge 0$: Similar to Case I, the optimality equation is satisfied for the deactivate action. Note that $\bar{r}((L, 0, i), 1) = F_{0,1}^{(i+1)}$. The r.h.s.
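for the activate action is given by,

$$\text{r.h.s.}(1) = F^{(i+1)}_{0,1} + F^{(i+1)}_{0,1}(1 - q)\, h^*((L - \delta_1 - \delta_2, 1, 0)) + F^{(i+1)}_{0,1}\, q\, h^*((L + c - \delta_1 - \delta_2, 1, 0))$$
$$+ \left(1 - F^{(i+1)}_{0,1}\right) q\, h^*((L + c - \delta_1, 0, 0)) + \left(1 - F^{(i+1)}_{0,1}\right)(1 - q)\, h^*((L - \delta_1, 0, 0))$$
$$= \alpha L + \alpha q c + F^{(i+1)}_{0,1}(1 - \alpha\delta_2) - \alpha\delta_1.$$

Since the l.h.s. equals $\alpha L + \alpha q c$, the error fraction $\epsilon$ equals $F^{(i+1)}_{0,1}(1 - \alpha\delta_2) - \alpha\delta_1$. Since $\alpha\delta_1$ and $(1 - \alpha\delta_2)$ are of order $o\!\left(\frac{1}{\beta}\right)$, $\epsilon \sim o\!\left(\frac{1}{\beta}\right)$, i.e. $\epsilon \to 0$ as $\beta$ becomes large. Assuming sufficiently large $\beta$, activation is optimal if $F^{(i+1)}_{0,1} \ge \dfrac{\alpha\delta_1}{1 - \alpha\delta_2} = \dfrac{\pi^{on}}{\pi^{on} + 1 - p_c^{on}}$.

Case IV: $(L, 1, i)$, $\delta_1 + \delta_2 \le L \le K - c$, $i \ge 0$: Similar to Case I, the optimality equation is satisfied for the deactivate action. Note that $\bar{r}((L, 1, i), 1) = 1 - F^{(i+1)}_{1,0}$. The r.h.s. for the activate action is given by,

$$\text{r.h.s.}(1) = \left(1 - F^{(i+1)}_{1,0}\right) + \left(1 - F^{(i+1)}_{1,0}\right)(1 - q)\, h^*((L - \delta_1 - \delta_2, 1, 0)) + \left(1 - F^{(i+1)}_{1,0}\right) q\, h^*((L + c - \delta_1 - \delta_2, 1, 0))$$
$$+ F^{(i+1)}_{1,0}\, q\, h^*((L + c - \delta_1, 0, 0)) + F^{(i+1)}_{1,0}(1 - q)\, h^*((L - \delta_1, 0, 0))$$
$$= \alpha L + \alpha q c + \left(1 - F^{(i+1)}_{1,0}\right)(1 - \alpha\delta_2) - \alpha\delta_1.$$

Using arguments similar to Case III, the error $\epsilon \sim o\!\left(\frac{1}{\beta}\right)$. Assuming sufficiently large $\beta$, activation is optimal if $F^{(i+1)}_{1,0} \le \dfrac{1 - \alpha\delta_2 - \alpha\delta_1}{1 - \alpha\delta_2} = \dfrac{1 - p_c^{on}}{\pi^{on} + 1 - p_c^{on}}$.

Case V: $(L, 0, i)$, $K - c < L \le K$, $i \ge 0$: We evaluate the r.h.s. in (6.16) for both the activate and deactivate actions. Simplifying, we get, $\text{r.h.s.}(0) = \alpha L + \alpha q c + \alpha q(K - L - c)$ and $\text{r.h.s.}(1) = \alpha L + \alpha q c + F^{(i+1)}_{0,1}(1 - \alpha\delta_2) - \alpha\delta_1$. The optimality equation is given by,

$$\Gamma^* + h^*((L, 0, i)) = \alpha L + \alpha q c + \max\left[\alpha q(K - L - c),\; F^{(i+1)}_{0,1}(1 - \alpha\delta_2) - \alpha\delta_1\right].$$

Similar to Case III, the equation is satisfied within an error of $o\!\left(\frac{1}{\beta}\right)$. Activation is optimal if $F^{(i+1)}_{0,1} \ge \dfrac{\alpha\delta_1 + \alpha q(K - L - c)}{1 - \alpha\delta_2} = \dfrac{\pi^{on}}{\pi^{on} + 1 - p_c^{on}} - \dfrac{q\pi^{on}(L + c - K)}{\delta_1(\pi^{on} + 1 - p_c^{on})}$. Note that since $F^{(i)}_{0,1}$ is an increasing function of $i$, the larger the recharge rate $q$, the earlier the activation in this state.

Case VI: $(L, 1, i)$, $K - c < L \le K$, $i \ge 0$: We evaluate the r.h.s. in (6.16) for both the activate and deactivate actions.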
Simplifying, we get, $\text{r.h.s.}(0) = \alpha L + \alpha q c + \alpha q(K - L - c)$ and $\text{r.h.s.}(1) = \alpha L + \alpha q c + \left(1 - F^{(i+1)}_{1,0}\right)(1 - \alpha\delta_2) - \alpha\delta_1$. The optimality equation is given by,

$$\Gamma^* + h^*((L, 1, i)) = \alpha L + \alpha q c + \max\left[\alpha q(K - L - c),\; \left(1 - F^{(i+1)}_{1,0}\right)(1 - \alpha\delta_2) - \alpha\delta_1\right].$$

Similar to Case IV, the equation is satisfied within an error of $o\!\left(\frac{1}{\beta}\right)$. Activation is optimal if $F^{(i+1)}_{1,0} \le \dfrac{1 - \alpha(\delta_1 + \delta_2) + \alpha q(L + c - K)}{1 - \alpha\delta_2} = \dfrac{1 - p_c^{on}}{\pi^{on} + 1 - p_c^{on}} + \dfrac{q\pi^{on}(L + c - K)}{\delta_1(\pi^{on} + 1 - p_c^{on})}$.

Theorem 18 The values of $\Gamma^*$ and $h^*$ given by $h^*((L, 0, i)) = h^*((L, 1, i)) = \alpha L$ $\forall i \ge 0$, $\Gamma^* = \alpha q c$, where $\alpha = \dfrac{1}{\delta_2 + \delta_1\left(1 + \frac{1 - p_c^{on}}{\pi^{on}}\right)}$, satisfy the optimality equations for the POMDP within an error of $\epsilon \sim o\!\left(\frac{1}{\beta}\right)$. The $\epsilon$-optimal actions for each state are as given in the cases above.

Some of the properties of the optimal policy are characterized in the following result.

Lemma 19 The $\epsilon$-optimal policy whose solution is given by (6.17) satisfies the following properties, $\forall L \ge \delta_1 + \delta_2$: (i) $\mu^*((L, 1, 0)) = 1$ and $\mu^*((L, 0, 0)) = 0$ (for $L \le K - c$), (ii) $\mu^*((L, 1, i)) = 1 \Rightarrow \mu^*((L, 1, i - 1)) = 1$, $\forall i \ge 1$, and (iii) $\mu^*((L, 0, i)) = 1 \Rightarrow \mu^*((L, 0, i + 1)) = 1$, $\forall i \ge 0$.

Proof: From Cases IV, VI above, since $\pi^{on} \le p_c^{on}$, we have $F^{(1)}_{1,0} = 1 - p_c^{on} \le \dfrac{1 - p_c^{on}}{\pi^{on} + 1 - p_c^{on}}$. Therefore, $\mu^*((L, 1, 0)) = 1$. Similarly, since $\frac{1}{2} < p_c^{on}, p_c^{off} < 1$, we have $F^{(1)}_{0,1} = 1 - p_c^{off} < \dfrac{\pi^{on}}{\pi^{on} + 1 - p_c^{on}}$. Therefore, from Case III above, $\forall L : \delta_1 + \delta_2 \le L \le K - c$, $\mu^*((L, 0, 0)) = 0$. Properties (ii) and (iii) follow since the functions $F^{(i)}_{0,1}$ and $F^{(i)}_{1,0}$ are non-decreasing in $i$ from (6.14).

6.3.1.3 Optimal Policy evaluation using Value Iteration

We use relative value iteration [7] to numerically solve the optimality equations in (6.16) for some sample cases. (We set the maximum value for $i$ to be a large number denoted $I_{MAX}$ and appropriately define the transition probabilities for the vectors $p = (L, E, I_{MAX})$.) From the numerical results, we have the following observations.
We observe that (a) $h^*((L, 0, i))$ is non-decreasing in $i$, and (b) $h^*((L, 1, i))$ is non-increasing in $i$. However, it seems difficult to characterize the function $h^*$ in exact closed form. We observe that the optimal policy evaluated using value iteration satisfies the properties outlined in Lemma 19.

We observe that the optimal action is to activate in state $Z = (L, 1, 0)$, $\forall L : \delta_1 + \delta_2 \le L \le K$. In words, the optimal policy is to activate the sensor during the On period, as long as the sensor has sufficient energy. In state $Z = (L, 0, 0)$, the optimal action is to deactivate for all values of $L \le K - \gamma$. The exact value of $\gamma$ depends on system parameters (recharge and discharge rates, and $K$). In words, the optimal policy is to deactivate the sensor during the Off period (unless it has over-abundant energy). Once the sensor becomes inactive, it does not have any information about the event occurrence state of the system. Hence, during inactive states its decisions are based only upon its current energy level, the number of time slots it has been inactive for, and the most recent actively observed event occurrence state of the system.

The sensor may get deactivated for two different reasons. First, it may run out of energy, i.e., its energy level becomes $< \delta_1 + \delta_2$ during the On period. This scenario corresponds to states of the form $Z = (L, 1, i)$. Second, it may decide to switch itself off (deactivate deliberately) on finding the system in the Off period. The corresponding states are of the form $Z = (L, 0, i)$. If the sensor dies (during the On period), it applies an aggressive wakeup strategy and tries to activate itself soon. However, when the sensor decides to deactivate itself deliberately, it applies a reluctant wakeup strategy. These two different strategies are described using non-linear functions $f^1$ and $f^0$ respectively.

Function $f^0(t) : t \ge 0$ is a non-increasing function of $t$ such that $\delta_1 + \delta_2 \le f^0(t) \le K$.
In the inactive state $Z = (L, 0, t)$, the sensor has been inactive for $t$ time slots, and it checks if its current energy level is greater than or equal to $f^0(t)$. If true, the sensor activates itself; otherwise it remains inactive for one more time slot. With each passing time slot, the threshold energy level $f^0(t)$ required for activation decreases (or remains the same), until the sensor's current energy level reaches the threshold and the sensor is activated. Similarly, function $f^1(t) : t \ge 0$ is a non-decreasing function of $t$ such that $\delta_1 + \delta_2 \le f^1(t) \le K$. This function is applicable to the inactive states $Z = (L, 1, t)$ in a manner similar to the function $f^0(t)$. Thus, the optimal actions are given by,

$$\mu^*((L, 1, t)) = \begin{cases} 1 & \text{if } L \ge f^1(t), \\ 0 & \text{otherwise;} \end{cases} \qquad \mu^*((L, 0, t)) = \begin{cases} 1 & \text{if } L \ge f^0(t), \\ 0 & \text{otherwise.} \end{cases} \tag{6.19}$$

A typical plot of functions $f^0(t)$ and $f^1(t)$ for system parameters $\delta_1 = c = 1$, $\delta_2 = 2$, $p_c^{on} = 0.6$, $p_c^{off} = 0.9$, $q = 0.1$, $K = 500$ is shown in Figure 6.3(a). The probability that an event would occur in the next time slot, when the sensor is in state $(L, 0, t)$, equals $F^{(t)}_{0,1}$. Similarly, the probability that an event would occur in the next time slot, when the sensor is in state $(L, 1, t)$, equals $1 - F^{(t)}_{1,0}$. From (6.14),

$$\lim_{t \to \infty} F^{(t)}_{0,1} = \pi^{on} = 1 - \pi^{off} = 1 - \lim_{t \to \infty} F^{(t)}_{1,0}.$$

Therefore, for the sensor with partial state information, the states $(L, 0, t)$ and $(L, 1, t)$ are equally rewarding if it decides to activate, for large values of $t$. Thus, the optimal action in state $(L, 0, t)$ is the same as the optimal action in state $(L, 1, t)$ for sufficiently large $t$. This explains the convergence of the functions $f^0(t)$ and $f^1(t)$ at large values of $t$. The rate of this convergence depends on the rate of convergence of the $F$ functions to their respective steady-state probabilities. Note that functions $f^0$ and $f^1$ satisfy the properties mentioned in Lemma 19.
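The convergence of the two wakeup probabilities discussed above can be checked numerically. The following Python sketch iterates the recursion (6.13), compares it with the closed form (6.14), and evaluates both for the event-process parameters of Figure 6.3(a), $p_c^{on} = 0.6$ and $p_c^{off} = 0.9$. The function and variable names are illustrative only, and the sketch assumes $\pi^{on} = (1 - p_c^{off})/(2 - p_c^{on} - p_c^{off})$, which is the limit implied by (6.14).

```python
def f_recursive(p_on, p_off, n):
    """Return lists (F01, F10) with F01[i], F10[i] for i = 0..n via recursion (6.13)."""
    f01, f10 = [0.0], [0.0]  # F^(0) = 0 for both sequences
    for _ in range(n):
        a, b = f01[-1], f10[-1]
        f01.append((1 - a) * (1 - p_off) + a * p_on)
        f10.append((1 - b) * (1 - p_on) + b * p_off)
    return f01, f10


def f_closed(p_on, p_off, i):
    """Return (F01^(i), F10^(i)) from the closed form (6.14)."""
    decay = 1.0 - (p_on + p_off - 1.0) ** i
    denom = 2.0 - p_on - p_off
    return (1 - p_off) * decay / denom, (1 - p_on) * decay / denom


# Event-process parameters of Figure 6.3(a); the names are hypothetical.
P_ON, P_OFF = 0.6, 0.9
PI_ON = (1 - P_OFF) / (2 - P_ON - P_OFF)  # assumed stationary On probability
F01, F10 = f_recursive(P_ON, P_OFF, 50)

if __name__ == "__main__":
    # Probability of an event in the next slot after t inactive slots,
    # when the last observed state was Off (F01) or On (1 - F10).
    for t in (1, 5, 10, 50):
        print(t, F01[t], 1 - F10[t])
```

For these parameters the gap between the two probabilities shrinks geometrically with ratio $p_c^{on} + p_c^{off} - 1 = 0.5$, which is consistent with $f^0$ and $f^1$ merging after a few tens of slots.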
As the recharge rate increases, the threshold energy wakeup functions ($f^0$, $f^1$) decrease (or remain the same) at all values of sleep duration, as shown in Figure 6.3(b). Thus the sensor's energy threshold to activate in any state $Z$ decreases as its recharge rate increases. We observe that the converged value of the functions $f^0$, $f^1$ depends heavily on the recharge rate. Figure 6.4 shows a similar trend in the symmetric case $p_c^{on} = p_c^{off} = p_c$.

Figure 6.3: Threshold energy wakeup functions $f^0$ and $f^1$ (energy level $L$ versus sleep duration $t$). (a) Low recharge rate ($q = 0.1$). (b) High recharge rate ($q = 0.5$). Optimal action is to activate if the current sensor energy level is at least as much as the threshold energy wakeup function value for the corresponding state.

Figure 6.4: Threshold energy wakeup functions $f^0$ and $f^1$ (energy level $L$ versus sleep duration $t$) for the symmetric case ($p_c = 0.8$, $q = 0.4$).

Next, we develop a deterministic and memoryless activation algorithm and show that it achieves near-optimal performance in all practical scenarios. Note that the class of deterministic, memoryless algorithms is exponentially large [51], since there are $3(K + 1)$ observations and 2 actions possible from each observation (except the observations with energy level $L < \delta_1 + \delta_2$), leading to an exponential number (of the order $O(2^{3(K+1)})$) of deterministic policies. Computing the optimal deterministic policy may be intractable (NP-Complete) [51], and it may also require the sensor to keep track of its current energy level at all times. Hence, we try to formulate a near-optimal policy and compare its performance with the known performance bound given in (6.18).
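Since (6.18) serves as the reference bound in what follows, it may help to see it evaluated for a concrete parameter set. The Python sketch below computes $\alpha$ from (6.17) and the right-hand side of (6.18). The function names are illustrative, and the expression $\pi^{on} = (1 - p_c^{off})/(2 - p_c^{on} - p_c^{off})$ is taken as the stationary On probability of the event process.

```python
def stationary_on(p_on, p_off):
    """Stationary probability pi^on of the two-state event process."""
    return (1.0 - p_off) / (2.0 - p_on - p_off)


def alpha(delta1, delta2, p_on, p_off):
    """Slope alpha of the approximate solution h*((L, E, i)) = alpha * L in (6.17)."""
    pi_on = stationary_on(p_on, p_off)
    return 1.0 / (delta2 + delta1 * (1.0 + (1.0 - p_on) / pi_on))


def detection_bound(q, c, delta1, delta2, p_on, p_off):
    """Right-hand side of (6.18): bound on the fraction of events detected."""
    pi_on = stationary_on(p_on, p_off)
    return q * c / (pi_on * (delta1 + delta2) + delta1 * (1.0 - p_on))


if __name__ == "__main__":
    # Parameters of Figure 6.3(a); the dictionary keys are hypothetical names.
    params = dict(q=0.1, c=1, delta1=1, delta2=2, p_on=0.6, p_off=0.9)
    print("bound (6.18):", detection_bound(**params))
```

For the parameters of Figure 6.3(a), $\alpha = 0.2$ and the bound evaluates to about $0.1$, which equals $\alpha q c / \pi^{on}$ as (6.18) requires.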
6.3.2 Energy Balancing Correlation-dependent Wakeup Policies

Recall the Correlation-dependent Wakeup (CW) policies defined in Section 6.1.3.2. We show that the CW policy utilizing an energy-balancing sleep duration, denoted $\Pi_{EB-CW}$, approaches the performance achieved by the optimal policy when $\beta$ and $K$ are sufficiently large. To motivate the analysis, we make the following assumptions: (i) the sensor never runs out of charge, and (ii) the sensor does not lose any recharge quantum. These assumptions exclude the extreme scenarios in which the sensor's energy level reaches either 0 or $K$. If $K$ is sufficiently large, and the sensor operates in energy balance, i.e., its average discharge rate equals its average recharge rate, these extreme scenarios represent two rare events whose probability of occurrence is provably small. Based upon the above assumptions, we derive an upper bound on CW performance in Section 6.3.2.1, and analyze the performance of $\Pi_{EB-CW}$ in Section 6.3.2.2. We consider the effects of the rare events (boundary conditions) in Section 6.3.2.3.

6.3.2.1 Upper Bound on CW Performance

Let us consider the CW algorithm utilizing a sleep duration of $SI$ time slots. Consider the time interval between time slots $t_1$ and $t_2$, where $t_1$ and $t_2$ are two successive time slots at which the sensor starts sleeping. Since the sensor decides to sleep only during the Off period, i.e., when the event process goes to the Off state, these two time instances at which the sensor enters the sleep state represent renewal instances of the sensor-event system state. Figure 6.5 depicts a typical renewal interval. Let $P_e$ denote the probability that the event occurrence process is On when the sensor wakes up and polls the system. Therefore, $P_e = F^{(SI+1)}_{0,1}$. Since $SI \ge 0$, from (6.14), $(1 - p_c^{off}) \le P_e \le \pi^{on}$. Let $T_e = t_2 - t_1$, and let $V$ denote the number of events detected if the sensor finds the system in the On period upon wakeup. From Section 6.1.1, we have, $E[V] = \frac{1}{1 - p_c^{on}}$. Therefore,

$$E[T_e] = SI + P_e(E[V] + 1) + (1 - P_e) \cdot 1 = SI + 1 + \frac{P_e}{1 - p_c^{on}}. \tag{6.20}$$

Figure 6.5: Event occurrence process and sensor activation states for CW policy during a renewal interval. Here 'A', 'I' represent active and inactive sensor states, 'Y', 'N' represent event occurrence and otherwise, and $t_1$, $t_2$ are renewal instances.

Lemma 20 The maximum achievable performance for any CW policy $\Pi_{CW}$ is upper bounded as, $\widehat{U}(\Pi_{CW}) \le \dfrac{qc}{\pi^{on}(\delta_2 + \delta_1) + \delta_1(1 - p_c^{on})} = U^*_{CW}$.

Proof: Consider a renewal interval for the CW policy $\Pi_{CW}$. Let $n_1$ denote the number of times the sensor moves from sleep to active state during this interval. Let $P(n_1)$ denote the probability that an event occurs when the sensor activates itself after waking up. Similarly, $n_2$ denotes the number of times the sensor moves from active state to active state during the interval, and $P(n_2)$ denotes the probability that an event occurs when the sensor activates itself while in active state. We have, $n_1 = 1$, $P(n_1) = P_e = F^{(SI+1)}_{0,1}$, $n_2 = P_e V$, and $P(n_2) = p_c^{on}$. The success probability during a renewal interval is given by, $P_s = \dfrac{n_1 P(n_1) + n_2 P(n_2)}{n_1 + n_2} = \dfrac{P_e + P_e V p_c^{on}}{1 + P_e V}$. We have,

$$E[P_s] = \frac{P_e + E[V] P_e\, p_c^{on}}{1 + P_e E[V]} = \frac{P_e\left(1 + \frac{p_c^{on}}{1 - p_c^{on}}\right)}{1 + \frac{P_e}{1 - p_c^{on}}} = \frac{P_e}{1 - p_c^{on} + P_e} \le \frac{\pi^{on}}{1 - p_c^{on} + \pi^{on}}. \tag{6.21}$$

Since the renewal intervals are i.i.d. sequences, the steady-state success probability of the sensor equals the expected value of the success probability achieved during one interval. From (6.7), $\widehat{U}(\Pi_{CW}) \le \dfrac{1}{\pi^{on}} \dfrac{P_s\, qc}{\delta_1 + P_s \delta_2} \le \dfrac{qc}{\pi^{on}(\delta_2 + \delta_1) + \delta_1(1 - p_c^{on})}$.

Note that this performance approaches the (loose) upper bound on achievable performance (given by Lemma 13) as $p_c^{on} \to 1$. The bound given by Lemma 13 still serves as an upper bound for any activation algorithm in the partially observable scenario. However, the presence of imperfect observations implies that this bound is loose.
A tighter performance bound is given by (6.18). The fact that the maximum achievable performance of any CW policy is also bounded by (6.18) suggests that an optimal policy may exist within the class of CW policies.

6.3.2.2 Performance of EB-CW policy

Now, we show that for large sensor energy bucket size $K$, the performance of $\Pi_{EB-CW}$ approaches the performance bound given by Lemma 20. $\Pi_{EB-CW}$ employs an energy-balancing sleep duration $SI^*$, which enables the sensor to spend as much energy during the course of its operation as is gained by the sensor through recharge. Consider a renewal interval whose length is denoted by $T_e$. Let $E_1$ and $E_2$ denote the expected number of recharge and discharge quanta respectively during the interval. Using (6.20), we have,

$$E_1 = qc\, E[T_e] = qc\left(SI^* + 1 + \frac{P_e}{1 - p_c^{on}}\right); \quad E_2 = \delta_1 + \frac{P_e}{1 - p_c^{on}}(\delta_1 + \delta_2). \tag{6.22}$$

Equating $E_1 = E_2$, we get,

$$E[T_e] = \frac{\delta_1}{qc} + \frac{P_e(\delta_1 + \delta_2)}{qc\,(1 - p_c^{on})}; \quad SI^* = \frac{\delta_1}{qc} + \frac{P_e(\delta_1 + \delta_2 - qc)}{qc\,(1 - p_c^{on})} - 1. \tag{6.23}$$

The expected number of events detected during the interval equals $P_e E[V]$. Since the renewal intervals are i.i.d., as $T \to \infty$, $E_d(T) = \frac{P_e E[V]\, T}{E[T_e]}$ and $E_o(T) = \pi^{on} T$. From (6.1), the performance of $\Pi_{EB-CW}$ is given by,

$$\widehat{U}(\Pi_{EB-CW}) = \frac{P_e E[V]}{\pi^{on} E[T_e]} = \frac{qc}{\pi^{on}\left[\delta_2 + \delta_1\left(1 + \frac{1}{P_e}(1 - p_c^{on})\right)\right]}. \tag{6.24}$$

We compare this performance with the maximum achievable performance (loose upper bound) given by Lemma 13, and with the bound given by Lemma 20. Recall that $\delta_2 = \beta\delta_1$.

Theorem 21 The energy-balancing CW policy, $\Pi_{EB-CW}$, achieves the following performance bounds:

(i) $\widehat{U}(\Pi_{EB-CW}) \ge \dfrac{\beta + 1}{\beta + \frac{1}{\pi^{on}}} \cdot \dfrac{1}{\pi^{on}} \dfrac{p_c^{on}\, qc}{\delta_1 + p_c^{on} \delta_2} \ge \dfrac{\beta + 1}{\beta + \frac{1}{\pi^{on}}}\, U^*_{CW}$.

(ii) $\widehat{U}(\Pi_{EB-CW}) \ge \dfrac{P_e}{\pi^{on}}\, U^*_{CW}$.

Proof: Since $P_e \ge 1 - p_c^{off}$, we have, $\delta_2 + \dfrac{\delta_1}{\pi^{on}} \ge \delta_2 + \delta_1\left(1 + \dfrac{1 - p_c^{on}}{P_e}\right)$. Also, $\delta_2 + \dfrac{\delta_1}{\pi^{on}} = (\delta_1 + \delta_2)\left[\dfrac{\beta}{\beta + 1} + \dfrac{1}{(\beta + 1)\pi^{on}}\right] = (\delta_1 + \delta_2)\, \dfrac{\beta + \frac{1}{\pi^{on}}}{\beta + 1}$. Thus, we have,

$$\widehat{U}(\Pi_{EB-CW}) = \frac{qc}{\pi^{on}\left[\delta_2 + \delta_1\left(1 + \frac{1}{P_e}(1 - p_c^{on})\right)\right]} \ge \frac{qc}{\pi^{on}\left(\delta_2 + \frac{\delta_1}{\pi^{on}}\right)} = \frac{\beta + 1}{\beta + \frac{1}{\pi^{on}}} \cdot \frac{qc}{\pi^{on}(\delta_1 + \delta_2)}$$
$$\ge \frac{\beta + 1}{\beta + \frac{1}{\pi^{on}}} \cdot \frac{p_c^{on}\, qc}{\pi^{on}(\delta_1 + \delta_2\, p_c^{on})} \ge \frac{\beta + 1}{\beta + \frac{1}{\pi^{on}}}\, U^*_{CW}.$$

The second inequality follows from (6.8). The last inequality follows since $p_c^{on} \ge \pi^{on}$. Since $P_e \le \pi^{on}$, $\dfrac{P_e}{\pi^{on}}\left[\delta_2 + \delta_1\left(1 + \dfrac{1 - p_c^{on}}{P_e}\right)\right] \le \delta_2 + \delta_1\left(1 + \dfrac{1 - p_c^{on}}{\pi^{on}}\right)$. This implies, $\dfrac{1}{\delta_2 + \delta_1\left(1 + \frac{1 - p_c^{on}}{P_e}\right)} \ge \dfrac{P_e}{\pi^{on}} \cdot \dfrac{1}{\delta_2 + \delta_1\left(1 + \frac{1 - p_c^{on}}{\pi^{on}}\right)}$. Now the bound (ii) follows from (6.24).

Thus $\Pi_{EB-CW}$ achieves near-optimal performance. In fact, from Theorem 21, it achieves performance $\ge \left(1 - o\!\left(\frac{1}{\beta}\right)\right) U^*_{CW}$.

6.3.2.3 Performance Effects of Boundary Conditions

The performance analysis of $\Pi_{EB-CW}$ does not consider the following rare events: (i) the sensor may not be able to detect $V$ events continuously, since it may run out of its charge completely and enter the dead state, and (ii) the sensor may lose some recharge quanta because its energy bucket was full at the time the recharge quanta arrived. However, with sufficiently large bucket size $K$, and utilizing an energy-balancing sleep duration, both the above rare events occur with a very small probability. Hence the policy $\Pi_{EB-CW}$ utilizing a deterministic, constant sleep duration given by (6.23) achieves the performance given by (6.24).

Let $T_e$ denote the length of a renewal interval for the policy $\Pi_{EB-CW}$. Let $X$ denote the number of recharge quanta received by the sensor, and $Y$ denote the number of energy quanta spent by the sensor during the interval. We have, $\Pr[X = ic \mid T_e] = \binom{T_e}{i} q^{i} (1 - q)^{T_e - i}$, $0 \le i \le T_e$. This implies, $E[X \mid T_e] = T_e\, qc$. Using (6.23),

$$E[X] = E[E[X \mid T_e]] = E[T_e]\, qc = \delta_1 + \frac{P_e(\delta_1 + \delta_2)}{(1 - p_c^{on})}. \tag{6.25}$$

Similarly, $\Pr[Y = \delta_1] = 1 - P_e$. Since $Y \le X + K$, assuming $\frac{K}{(\delta_1 + \delta_2)} \to \infty$, we have, $\Pr[Y = \delta_1 + i(\delta_1 + \delta_2)] = P_e (1 - p_c^{on}) (p_c^{on})^{(i-1)}$, $i \ge 1$. This implies,

$$E[Y] = \delta_1 + \frac{P_e(\delta_1 + \delta_2)}{(1 - p_c^{on})} = E[X]. \tag{6.26}$$

Similarly, we have,

$$\mathrm{Var}[X] = E\left[E\left[X^2 \mid T_e\right] - \left(E[X \mid T_e]\right)^2\right] = c^2 q(1 - q) E[T_e] = c(1 - q)\left[\delta_1 + \frac{P_e(\delta_1 + \delta_2)}{(1 - p_c^{on})}\right]. \tag{6.27}$$

$$\mathrm{Var}[Y] = E\left[Y^2\right] - (E[Y])^2 = \frac{(\delta_1 + \delta_2)^2}{(1 - p_c^{on})^2}\, P_e\left(1 + p_c^{on} - P_e\right). \tag{6.28}$$

Let $T_{e1}, T_{e2}, \ldots, T_{eM}$ denote $M$ successive renewal intervals. Let $(X_1, Y_1), \ldots, (X_M, Y_M)$ denote the number of quanta received and spent respectively during these $M$ intervals. Note that $X_i, Y_i : 1 \le i \le M$ are i.i.d. random variables. Let us define $M$ random variables, $Z_i = X_i - Y_i$, $1 \le i \le M$. We have, $E[Z_i] = E[X_i] - E[Y_i] = 0$, $1 \le i \le M$, and $\mathrm{Var}[Z_i] = E[Z_i^2] = E[X_i^2] + E[Y_i^2] - 2E[X_i]E[Y_i]$. Since $E[X] = E[Y]$, we have, $\mathrm{Var}[Z_i] = \mathrm{Var}[X_i] + \mathrm{Var}[Y_i]$, $1 \le i \le M$. Let $\bar{Z} = \frac{\sum_{i=1}^{M} Z_i}{M}$. Note that, $E[\bar{Z}] = 0$, and $\mathrm{Var}[\bar{Z}] = \frac{(\mathrm{Var}[X] + \mathrm{Var}[Y])}{M}$. From Chebyshev's inequality [55], we have, $\Pr[|\bar{Z} - E[\bar{Z}]| \ge \epsilon] \le \frac{\mathrm{Var}[\bar{Z}]}{\epsilon^2}$. This implies,

$$\Pr[|\bar{Z}| \ge \epsilon] \le \frac{\mathrm{Var}[X] + \mathrm{Var}[Y]}{M\epsilon^2}. \tag{6.29}$$

Let us assume that the sensor had an energy level of $\frac{K}{2}$ at the beginning of interval $T_{e1}$. We observe that if for any $M$, $\sum_{i=1}^{M} X_i - \sum_{i=1}^{M} Y_i > \frac{K}{2}$, then some energy quanta get dropped because of the sensor energy bucket being full. On the other hand, if $\sum_{i=1}^{M} Y_i - \sum_{i=1}^{M} X_i > \frac{K}{2}$, then the sensor dies while detecting events during the On period. These two cases correspond to the two rare events outlined above. Assuming the sensor energy level at any time to be $O(K)$, we would like to bound the probability of the event $|Z_1 + Z_2 + \ldots + Z_M| > O(K)$ or $|\bar{Z}| > \frac{O(K)}{M}$. Towards this end, we obtain,

$$\Pr\left[|\bar{Z}| \ge \frac{O(K)}{M}\right] \le \frac{M(\mathrm{Var}[X] + \mathrm{Var}[Y])}{O(K^2)}. \tag{6.30}$$

Let us assume that for some $M$, one of the rare events occurs and the sensor dies while detecting events. Let $\bar{E}$ denote the number of events which the sensor fails to detect. It can be shown that $E[\bar{E}] \le \frac{(\delta_1 + \delta_2)\pi^{on}}{qc}$. The total number of events expected to occur in $M$ renewal intervals is given by $M E[T_e]\, \pi^{on}$.
The fractional loss in utility of the policy $\Pi_{EB-CW}$ due to the occurrence of a rare event is $\le \frac{\delta_1 + \delta_2}{qc\, M E[T_e]}$. The probability that such a rare event occurs is $\le \frac{M(\mathrm{Var}[X] + \mathrm{Var}[Y])}{O(K^2)}$ from (6.30). Thus the expected fractional loss in utility of $\Pi_{EB-CW}$, denoted $\tau$, is given by,

$$\tau \le \frac{(\delta_1 + \delta_2)\,(\mathrm{Var}[X] + \mathrm{Var}[Y])}{qc\, E[T_e]\; O(K^2)}. \tag{6.31}$$

Thus, when $K$ is sufficiently large, the policy $\Pi_{EB-CW}$ achieves a performance quite close to that given by (6.24), with the error difference being of the order $o\!\left(\frac{1}{K^2}\right)$.

Example 1: Let $p_c^{on} = 0.6$, $p_c^{off} = 0.9$, $c = 1$, $q = 0.5$, $\delta_1 = 1$, $\delta_2 = 6$, $K = 500$. Using $P_e \approx \pi^{on} = 0.2$, $E[T_e] = 9$, $\mathrm{Var}[X] = 2.25$, and $\mathrm{Var}[Y] = 85.75$. Thus, $\tau \le \frac{136}{O(K^2)} \approx 0.0005$.

6.3.3 Performance of AW Policy

Intuitively, one would expect that the AW policy (defined precisely in Section 6.1.3.1) does not perform better than the CW policy, since it does not take into account the degree of temporal correlations in the event occurrence process while taking activation decisions. Lemma 22 proves this fact explicitly.

Lemma 22 The maximum achievable performance for the AW policy is $U^*_{AW} = \dfrac{qc}{\delta_1 + \pi^{on}\delta_2}$.

Proof: Under aggressive wakeup, the sensor wakes up as soon as it has energy $\ge \delta_1 + \delta_2$ quanta. Assuming $\delta_1 > c$, the sensor would remain active for only one time slot upon wakeup, irrespective of the occurrence of an event in the time slot. If an event occurs during the time slot the sensor was active, the sensor moves to the dead state for an expected duration of $\frac{\delta_1 + \delta_2}{qc}$ time slots. Similarly, if an event does not occur while the sensor was active, the sensor moves to the dead state for an expected duration of $\frac{\delta_1}{qc}$ time slots.

Figure 6.6: Event occurrence upon activation for AW policy

The probability that the event occurrence state of the system has changed between two successive activations of the sensor under the AW policy can be represented using the $F$ functions. In Figure 6.6, $p_1 = F^{\left(\frac{\delta_1 + \delta_2}{qc} + 1\right)}_{1,0}$, and $p_2 = F^{\left(\frac{\delta_1}{qc} + 1\right)}_{0,1}$.
Using $(1 - p_c^{on})\, F^{(i)}_{0,1} = (1 - p_c^{off})\, F^{(i)}_{1,0}$, (6.14) and the fact that $\delta_2 > 0$, we have,

$$p_1 = F^{\left(\frac{\delta_1 + \delta_2}{qc} + 1\right)}_{1,0} = \frac{1 - p_c^{on}}{1 - p_c^{off}}\, F^{\left(\frac{\delta_1 + \delta_2}{qc} + 1\right)}_{0,1} \ge \frac{1 - p_c^{on}}{1 - p_c^{off}}\, F^{\left(\frac{\delta_1}{qc} + 1\right)}_{0,1} = \frac{1 - p_c^{on}}{1 - p_c^{off}}\, p_2. \tag{6.32}$$

From Figure 6.6, the steady-state probability $\Pr[E_t \mid u_t]$ is given by, $\Pr[E_t \mid u_t]\, p_1 = (1 - \Pr[E_t \mid u_t])\, p_2$, which implies the success probability for the AW policy, $P_s = \Pr[E_t \mid u_t] = \dfrac{p_2}{p_1 + p_2} \le \dfrac{1 - p_c^{off}}{2 - p_c^{on} - p_c^{off}} = \pi^{on}$. This inequality can also be shown to hold for any activation policy which does not take temporal correlations in the event process into account while deciding upon activation. The performance bound now follows from (6.7).

Note that the upper bounds given by Lemmas 13, 20 and 22 are the same for $p_c^{on} = p_c^{off} = \frac{1}{2}$. As the correlation probabilities $p_c^{on}$, $p_c^{off}$ increase from $\frac{1}{2}$ to 1, the bounds are in the following order:

$$U^*_{AW} \le U^*_{CW} \le \frac{1}{\pi^{on}} \frac{p_c^{on}\, qc}{\delta_1 + p_c^{on}\delta_2}.$$

Thus the AW policy is expected to perform worse than the best CW policy, for $\frac{1}{2} < p_c^{on}, p_c^{off} < 1$. We observe that the AW policy may perform well when $p_c^{on} = p_c^{off} = \frac{1}{2}$. We consider this case in Section 6.4.

6.3.4 Simulation Results

Simulations are performed for various system parameters for the CW activation algorithm. In addition, we simulate the performance of the CW policy under the assumption that the sensor has been provided with all its energy in advance. For instance, if the total time of simulation is $T$, then the sensor is expected to receive $Tqc$ recharge quanta during the course of its operation. We provide the sensor with an initial reserve of energy equal to $K + Tqc$ and disable its recharge at individual time slots during the time $[0 \ldots T]$. Note that this represents the case of a non-rechargeable sensor put to activation in a temporally correlated event occurrence environment.
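As a worked check of the energy-balance quantities from Section 6.3.2 that these simulations rely on, the Python sketch below evaluates (6.23), (6.27) and (6.28) under the approximation $P_e \approx \pi^{on}$ used in Example 1. The function name and dictionary keys are illustrative and not part of the original analysis.

```python
def eb_cw_quantities(q, c, delta1, delta2, p_on, p_off):
    """Energy-balance quantities of the EB-CW policy, assuming P_e ~= pi^on."""
    pi_on = (1.0 - p_off) / (2.0 - p_on - p_off)
    p_e = pi_on  # approximation used in Example 1, not the exact F01^(SI+1)
    # Expected renewal-interval length E[T_e], first part of (6.23).
    mean_te = delta1 / (q * c) + p_e * (delta1 + delta2) / (q * c * (1.0 - p_on))
    # Energy-balancing sleep duration SI*, obtained from (6.20) and (6.23).
    sleep = mean_te - 1.0 - p_e / (1.0 - p_on)
    # Variance of recharge quanta received per interval, (6.27).
    var_x = c * c * q * (1.0 - q) * mean_te
    # Variance of energy quanta spent per interval, (6.28).
    var_y = (delta1 + delta2) ** 2 * p_e * (1.0 + p_on - p_e) / (1.0 - p_on) ** 2
    return {"mean_te": mean_te, "sleep": sleep, "var_x": var_x, "var_y": var_y}


if __name__ == "__main__":
    # Parameters of Example 1.
    print(eb_cw_quantities(q=0.5, c=1, delta1=1, delta2=6, p_on=0.6, p_off=0.9))
    # Symmetric case p_c = 0.75 with the same energy parameters.
    print(eb_cw_quantities(q=0.5, c=1, delta1=1, delta2=6, p_on=0.75, p_off=0.75))
```

With the parameters of Example 1 the sketch reproduces $E[T_e] = 9$, $\mathrm{Var}[X] = 2.25$ and $\mathrm{Var}[Y] = 85.75$, and gives a sleep duration of 7.5 slots. For the symmetric case $p_c = 0.75$ it gives 27 slots. These are close to the values $SI^* = 7$ and $SI^* = 27$ quoted in the panel titles of Figure 6.7(a) and (c).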
Furthermore, with the sensor having all its energy in advance, the rare events outlined in Section 6.3.2.3 will never occur and hence the sensor would be able to achieve the performance given by (6.24) at the sleep duration specified by (6.23).

Figure 6.7 depicts the performance of the CW policy utilizing a sleep duration $SI$. AW represents the performance achieved by the AW policy. Performance achieved by the activation algorithm corresponding to the CW policy in the non-rechargeable case (where the sensor has all the energy in advance) is represented by CW (NR). Figure 6.7(c) plots the performance for the symmetric case $p_c^{on} = p_c^{off} = p_c$. The system parameters used are $q = 0.5$, $c = 1$, $\delta_1 = 1$, $\delta_2 = 6$, $K = 2400$. We observe that the optimal sleep duration for the CW (NR) algorithm also corresponds to the optimal sleep duration for the CW algorithm. We also observe that the CW (NR) algorithm performs better than the CW algorithm at all sleep durations. This is expected since the sensor does not lose recharge quanta due to its energy bucket being full and almost never enters the dead state while operating under the CW (NR) algorithm.

Figure 6.7: Performance of CW Policies (fraction of events detected versus sleep interval $SI$, with curves CW, CW (NR), AW, $U^*_{CW}$ and $U^*_{AW}$). (a) $p_c^{on} = 0.6$, $p_c^{off} = 0.9$, $SI^* = 7$. (b) $p_c^{on} = 0.7$, $p_c^{off} = 0.8$, $SI^* = 18$. (c) $p_c = 0.75$, $SI^* = 27$. (d) $p_c^{on} = 0.6$, $p_c^{off} = 0.9$, $q = 0.1$, $\delta_2 = 2$, $SI^* = 23$. At the energy-balancing sleep duration given by $SI^*$, the CW policy achieves near-optimal performance.

Note that
once the sensor dies, it would never activate again due to the absence of recharge under CW (NR). The performance for CW (NR) approaches the upper bound at a lower value of sleep duration, since the energy lost in polling often does not impact the CW (NR) algorithm as much as it impacts the CW algorithm. At larger sleep durations both the CW (NR) and CW algorithms perform very close to each other. This is because at large sleep durations, the recharge rate is significantly larger than the discharge rate for the CW algorithm, and hence the sensor operating under the CW algorithm is able to activate (and deactivate) itself at exactly the same time instances as those of the CW (NR) algorithm employing the same sleep duration. Figure 6.7(d) plots the performance for the same set of parameters as in Figure 6.3(a). We observe that the optimal sleep duration $SI^*$ is quite close to the value of $t$ for which functions $f^0$ and $f^1$ converge in Figure 6.3(a). At the sleep duration $SI^*$ calculated from (6.23) using $P_e = \pi^{on}$, both CW (NR) and CW perform very close to the upper bound. Thus $\Pi_{EB-CW}$ achieves near-optimal performance in these scenarios.

6.4 Temporally Uncorrelated Event Occurrence

We briefly discuss the case where the occurrence of events of interest is not correlated in time. An event occurs in a time slot with probability $p$, independent of the occurrence of events in previous time slots. Note that $p_c^{on} = p_c^{off} = 0.5$ corresponds to the case when $p = 0.5$. Note that the results in this section apply to both completely observable and partially observable systems.

The average recharge rate of the sensor is $qc$, while the average discharge rate of an active sensor is $\delta_1 + p\delta_2$. Consider the case where the average recharge rate is no less than the average discharge rate during the activation period. That is, $qc \ge \delta_1 + p\delta_2$. In this impractical scenario, the AW policy is optimal and achieves a performance quite close to 1, as depicted in this example.
Example 2: Let $c = 2$, $q = 0.8$, $p = 0.4$, $\delta_1 = \delta_2 = 1$.

    K      AW          CW (SI = 1)   CW (SI = 4)
    10     0.996522    0.624423      0.295716
    100    1.0         0.624458      0.295751

Intuitively, when $K$ is sufficiently large, and the recharge rate is at least as much as the discharge rate, the sensor's energy bucket would almost never reach a zero level, and hence the sensor would be able to detect all the events operating under the AW activation algorithm. Note that this result would hold even in the presence of temporal correlations in the event process, as long as the recharge rate exceeds the discharge rate during the active state.

In practical situations, the average recharge rate of a sensor would be considerably smaller than the average discharge rate. That is, $qc < \delta_1 + p\delta_2$. Intuitively, in this scenario, a sensor would be unable to detect all the events occurring in the region, irrespective of its energy bucket size $K$.

Lemma 23 The maximum achievable performance for any activation policy in the absence of temporal correlations across events is given by $\dfrac{qc}{(\delta_1 + p\delta_2)}$.

Proof: The success probability for any policy $\Pi$ satisfies, $P_s = \Pr[E_t \mid u_t] = \Pr[E_t] = p$. Since the events are uncorrelated in time, $E_o(T) = Tp$ and $E_d(T) = T_1 p$. Therefore, using (6.6), we have,

$$\widehat{U}(\Pi) = \lim_{T \to \infty} \frac{E_d(T)}{E_o(T)} = \lim_{T \to \infty} \frac{T_1}{T} \le \frac{qc}{\delta_1 + p\delta_2}.$$

Lemma 24 The AW policy achieves asymptotically optimal performance in the absence of temporal correlations across events.

Proof: Let $L_t$ denote the charge level of the sensor at time $t$ under the AW policy $\Pi_{AW}$. Let $p_s$ denote the steady-state probability of the event that at a particular time $t$, the sensor has sufficient energy to detect an event, i.e., $p_s = \Pr(L_t \ge \delta_1 + \delta_2)$. We have, $E[L_{t+1}] = p_s(L_t + qc - \delta_1 - p\delta_2) + (1 - p_s)(L_t + qc) = L_t + qc - p_s(\delta_1 + p\delta_2)$. Assuming some initial charge $L_0$, the expected charge at time $T$ is given by, $E[L_T] = L_0 + T[qc - p_s(\delta_1 + p\delta_2)]$.
Note that we are assuming that the sensor does not lose any charge due to its energy bucket being full at any time, which would be true when the sensor energy bucket size $K$ is sufficiently large, and the average discharge rate during the activation period is larger than the average recharge rate of the sensor. (Note that a more detailed analysis taking $K$ into account, similar to that described in Section 6.3.2.3, is possible here as well but is omitted for brevity.) Since $0 \le L_T \le K$, dividing by $T$ and taking limits as $T \to \infty$, we have,

$$0 \le \lim_{T \to \infty} \left[\frac{L_0}{T} + qc - p_s(\delta_1 + p\delta_2)\right] \le \lim_{T \to \infty} \frac{K}{T} = 0. \tag{6.33}$$

Therefore, $p_s = \dfrac{qc}{(\delta_1 + p\delta_2)}$. For sufficiently large $K$, the steady-state probability that the sensor is available during a particular time is given by $p_s$. Therefore, $T_1 = p_s T$, $E_d(T) = T_1 p = p_s T p$, and $E_o(T) = Tp$. Thus, $\widehat{U}(\Pi_{AW}) = \lim_{T \to \infty} \dfrac{p_s T p}{T p} = p_s = \dfrac{qc}{(\delta_1 + p\delta_2)}$, which is the maximum achievable performance from Lemma 23.

6.5 Summary and Conclusions

We have considered the node activation question for a rechargeable sensor in the presence of temporal correlations in the sensed phenomena. A simple and straightforward aggressive wakeup policy is optimal when any of the following conditions hold: (a) Average discharge rate during activation is no greater than the average recharge rate, or (b) Event occurrence is uncorrelated in time (Lemmas 23, 24). When events of interest are temporally correlated and the rate of recharge is significantly less than that of discharge, smart sleeping is very effective in improving the overall system performance. We formulate the sensor activation problem under complete observability as an MDP, bound the maximum achievable performance in Lemma 13, and outline the $\epsilon$-optimal solution to the optimality equations in Theorem 14, where $\epsilon \sim o\!\left(\frac{1}{K}\right)$ and $K$ is the sensor energy bucket size.
Under complete observability, the sensor should deactivate itself for one time slot when it detects the event process to be in the Off state. We show that such a policy is asymptotically optimal w.r.t. $K$ in Corollary 15. We formulate the sensor activation problem under partial observability as a POMDP, transform it into a countable state-space MDP in Lemma 16, and outline the $\epsilon$-optimal solution to the optimality equations in Theorem 18, where $\epsilon \sim o\!\left(\frac{1}{\beta}\right)$ and $\beta$ is the ratio of detection cost to operational cost. We characterize some properties of near-optimal policies in Lemma 19, and observe that the optimal policy evaluated using value iteration for sample cases satisfies these properties (Section 6.3.1.3). We upper bound the performance of CW policies in Lemma 20, and of AW policies in Lemma 22. Since the optimal policy structure is highly sensitive to system parameters, we focussed on developing a near-optimal policy (EB-CW) which is relatively invariant to system parameters. Under partial observability, the sensor should employ an appropriate sleep duration derived using energy balance during a renewal interval of the sensor operation. We show that such an activation policy (EB-CW) achieves optimal performance when $\beta$ and $K$ are sufficiently large (Theorem 21).

CHAPTER 7
Effect of Temporal Correlations in Multiple-Sensor Systems

In Chapter 6, we considered the node activation question for a single rechargeable sensor node operating under temporally correlated event occurrence phenomena. In this chapter, we consider a sensor system employing multiple rechargeable sensors to detect the events of interest in the region. The utility provided by multiple active sensors detecting an event exhibits diminishing returns with respect to the number of active sensors covering the region (see Figure 3.2). The sensors are required to contribute independently but collaboratively towards fulfilling a global objective in the network.
To this end, sensors are required to add constructively to the performance of the overall network. In addition, due to the tiny, low-cost nature of these devices, the sensors may not have complicated information processing or communication capabilities. Thus the sensors need to make decisions in the presence of limited and localized information only. Hence, there is a need to design simple algorithms, based on local information, for the operation of these sensor nodes. Note that the sensor nodes must make decisions based only upon local information and in an online manner, due to the randomness in the event process and in the recharge processes at the sensor nodes. Also, smart node activation policies should take into account the degree of temporal correlation in, and the current status of, the event process while deciding to activate or deactivate (put to sleep) the sensor nodes dynamically. We focus on threshold based activation policies and show that they achieve near-optimal performance, thus exhibiting their robustness even in the presence of temporal correlations in the sensed phenomena.

The chapter is organized as follows. Section 7.1 formulates the problem in terms of multiple sensors covering the region, and discusses the various activation algorithms considered. Section 7.2 presents the performance evaluation of various threshold based activation decision policies. We summarize our results in Section 7.3.

7.1 Problem Formulation

Each sensor has an energy consumption model as described in Section 6.1. Events of interest also occur in the region as described in Section 6.1. First, we formulate a metric to evaluate the performance of a node activation algorithm. Next, we outline the various threshold based activation algorithms considered.

7.1.1 Performance Metric

We utilize a continuous, non-decreasing, strictly concave utility function $U$ to measure the performance of the activation algorithm.
Let $N$ be the total number of sensors covering the region of interest. Let $p_d$ denote the individual event detection probability of each sensor node. Then at any time $t$, if the number of active sensors is $n_t$, the probability that an event occurring at time $t$ gets detected is given by $1 - (1 - p_d)^{n_t}$. Note that the detection probability is zero when the number of active sensors is zero, and increases with diminishing returns as $n_t$ increases from $0$ to $N$. Figure 3.2 depicts the shape of this utility function for various values of detection probability $p_d$.

The quality of coverage measured over the time interval $[0 \ldots T]$ is formulated as a utility metric $\bar{U}(\Pi)$, where $\Pi$ is the activation algorithm used by the $N$ sensors in the region. Let $n_t$ denote the number of active sensors in the region at time slot $t$ under activation policy $\Pi$. Let $x_t$ be the indicator variable denoting the occurrence of an event in time slot $t$. Then the utility metric $\bar{U}(\Pi)$ is given by
\[
\bar{U}(\Pi) = \lim_{T \to \infty} \frac{\sum_{t=1}^{T} x_t\, U(n_t)}{\sum_{t=1}^{T} x_t}. \tag{7.1}
\]
The decision problem is that of finding an activation policy $\Pi$ such that the objective function in (7.1) is maximized.

7.1.2 Threshold based Activation Policies

Threshold based activation policies are described in detail in Section 3.3.2. Under temporally correlated events, smart threshold based schemes might employ two different thresholds: a preferably larger threshold during the On periods, and a smaller threshold during the Off periods. Threshold based schemes are simpler to deploy in a practical network since they require minimal state information and can be realized based only upon local information, as discussed in Chapter 4. We consider two different threshold activation algorithms, namely the Time-invariant Threshold Policy (TTP) and the Correlation-dependent Threshold Policy (CTP).
The TTP algorithm is oblivious to the current state of the event process and targets a constant threshold parameter in the region at all times, while the CTP algorithm employs two different threshold parameters, one during the On periods and one during the Off periods.

7.1.2.1 Time-invariant Threshold Policy (TTP)

All ready sensors are considered available for activation in a particular time slot. In each time slot a threshold of $m$ is targeted from among the available sensors. Thus this activation policy does not take the temporal correlations in the event process into account while making activation decisions.

7.1.2.2 Correlation-dependent Threshold Policy (CTP)

All ready sensors are considered available for activation in a particular time slot. However, the threshold parameter depends on the state of the event process in the region. A threshold of $m$ is targeted in time slot $k$ from among the available sensors if the event process is known to be in the On state, i.e. if an event occurred in the previous time slot $(k-1)$. On the other hand, if an event did not occur in time slot $(k-1)$, a threshold of $n$ $(\le m)$ is targeted in the $k$-th time slot. Thus a CTP policy applies a time-varying threshold, which is adjusted depending upon the state of event occurrence in the system. Intuitively, the CTP policy tries to conserve the energy of the sensors during the Off periods, in order to be able to activate them to gain more utility later during the On periods. Note that the TTP is a special case of CTP, with $n = m$.

7.2 Performance Evaluation

The system parameters $q$, $c$, $\delta_1$, $\delta_2$, $p_c^{on}$, $p_c^{off}$, and $K$ are the same across the sensor nodes. The performance of the activation algorithm is measured using (7.1). We derive an upper bound on achievable performance in Section 7.2.1. We analyze the performance of various threshold based activation schemes in Section 7.2.2 and present simulation results in Section 7.2.3.
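The threshold rules of Section 7.1.2 and a finite-horizon version of the metric (7.1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical and do not come from the thesis, the utility is assumed to be the detection probability $1 - (1 - p_d)^{n_t}$ of Section 7.1.1, and the limit in (7.1) is replaced by a finite sum.

```python
def utility(n_active, p_d):
    """Detection probability U(n) = 1 - (1 - p_d)^n (Section 7.1.1)."""
    return 1.0 - (1.0 - p_d) ** n_active


def target_threshold(prev_event, m, n):
    """CTP target for the current slot: m if an event occurred in the
    previous slot, n otherwise. TTP is the special case n == m."""
    return m if prev_event else n


def activate(available, prev_event, m, n):
    """Number of sensors activated: the target, capped by the number of
    ready sensors (an assumption about how an unmet target is handled)."""
    return min(available, target_threshold(prev_event, m, n))


def time_average_utility(events, active_counts, p_d):
    """Finite-horizon estimate of (7.1): utility summed over event slots,
    divided by the number of event slots."""
    total = sum(x * utility(k, p_d) for x, k in zip(events, active_counts))
    count = sum(events)
    return total / count if count else 0.0
```

For example, `time_average_utility([1, 0, 1], [4, 0, 0], 0.5)` averages the utility over the two slots in which an event occurs; the slot without an event contributes nothing to either sum.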
7.2.1 Upper Bound on Achievable Performance

Lemma 25 The maximum achievable performance for any activation algorithm $\Pi$ in a system with $N$ sensors is bounded as
\[
\bar{U}(\Pi) \le U\!\left( \frac{1}{\pi^{on}} \cdot \frac{p_c^{on} N q c}{\delta_1 + p_c^{on} \delta_2} \right).
\]

Proof: Let $f$ and $p$ be measurable functions, finite a.e. on a set $R$. Suppose $fp$ and $p$ are integrable on $R$, $p \ge 0$, and $\int_R p > 0$. If $\phi$ is convex in an interval containing the range of $f$, then Jensen's inequality [76] states
\[
\phi\!\left( \frac{\int_R f p}{\int_R p} \right) \le \frac{\int_R \phi(f)\, p}{\int_R p}.
\]
Recall that $n_t$ denotes the number of sensors in the active state at time slot $t$. Since $U(\cdot)$ is concave, applying the above to the convex function $\phi = -U(\cdot)$ with $f = n_t$ and $p = x_t$, Jensen's inequality in the discrete state-space implies
\[
U\!\left( \frac{\sum_{t=1}^{T} n_t x_t}{\sum_{t=1}^{T} x_t} \right) \ge \frac{\sum_{t=1}^{T} U(n_t)\, x_t}{\sum_{t=1}^{T} x_t}.
\]
Since $U(\cdot)$ is continuous, we have
\[
\bar{U}(\Pi) = \lim_{T \to \infty} \frac{\sum_{t=1}^{T} U(n_t)\, x_t}{\sum_{t=1}^{T} x_t} \le \lim_{T \to \infty} U\!\left( \frac{\sum_{t=1}^{T} n_t x_t}{\sum_{t=1}^{T} x_t} \right). \tag{7.2}
\]
Define $\psi_i(t)$ such that $\psi_i(t) = 1$ if sensor $i$ is in the active state at time slot $t$ and $\psi_i(t) = 0$ otherwise. Then,
\begin{align*}
\bar{U}(\Pi) &\le U\!\left( \lim_{T \to \infty} \frac{\sum_{t=1}^{T} n_t x_t}{\sum_{t=1}^{T} x_t} \right)
= U\!\left( \lim_{T \to \infty} \frac{\sum_{t=1}^{T} x_t \sum_{i=1}^{N} \psi_i(t)}{\sum_{t=1}^{T} x_t} \right) \\
&= U\!\left( \lim_{T \to \infty} \sum_{i=1}^{N} \frac{\sum_{t=1}^{T} x_t \psi_i(t)}{\sum_{t=1}^{T} x_t} \right)
= U\!\left( \sum_{i=1}^{N} \lim_{T \to \infty} \frac{\sum_{t=1}^{T} x_t \psi_i(t)}{\sum_{t=1}^{T} x_t} \right) \\
&= U\!\left( N \lim_{T \to \infty} \frac{\sum_{t=1}^{T} x_t \psi_i(t)}{\sum_{t=1}^{T} x_t} \right).
\end{align*}
The last equality follows from the fact that all sensors are identical. Also, from Lemma 13, we have
\[
\lim_{T \to \infty} \frac{\sum_{t=1}^{T} x_t \psi_i(t)}{\sum_{t=1}^{T} x_t} \le \frac{1}{\pi^{on}} \cdot \frac{p_c^{on} q c}{\delta_1 + p_c^{on} \delta_2}. \tag{7.3}
\]
Now, since $U(\cdot)$ is non-decreasing,
\[
\bar{U}(\Pi) \le U\!\left( \frac{1}{\pi^{on}} \cdot \frac{p_c^{on} N q c}{\delta_1 + p_c^{on} \delta_2} \right).
\]

From Lemma 22, for a sensor operating under the AW policy (or under any algorithm which does not take temporal correlations into account), we have
\[
\lim_{T \to \infty} \frac{\sum_{t=1}^{T} x_t \psi_i(t)}{\sum_{t=1}^{T} x_t} \le \frac{qc}{\delta_1 + \pi^{on} \delta_2}.
\]
Using the above in place of (7.3) in the proof of Lemma 25, we have the following.

Corollary 26 The performance of any activation algorithm $\Pi$ which does not take into account the amount of temporal correlation in the event process is bounded as
\[
\bar{U}(\Pi) \le U\!\left( \frac{N q c}{\delta_1 + \pi^{on} \delta_2} \right).
\]

7.2.2 Performance of Threshold Policies

Let us analyze the performance of a CTP policy employing a threshold of $(m, n)$, denoted $\Pi_{m,n}$. (As we noted earlier, the performance of a TTP algorithm with parameter $m$ is the same as that of a CTP algorithm employing a threshold of $(m, m)$.) Consider an Off period of length $T_1$ followed by an On period of length $T_2$ in the event process. During the first time slot of the Off period, since an event occurred during the previous time slot, a threshold of $m$ is targeted by the threshold activation policy. During the next $T_1 - 1$ time slots, a threshold of $n$ is targeted. Also, during the first time slot of the On period, since an event did not occur during the previous time slot, a threshold of $n$ is targeted. During the next $T_2 - 1$ time slots, a threshold of $m$ is employed. From (6.2), we have
\[
E[T_1] = \frac{1}{1 - p_c^{off}}; \qquad E[T_2] = \frac{1}{1 - p_c^{on}}.
\]
The expected amount of energy gained by the sensor system during the Off-On cycle, denoted $E_1$, is given by
\[
E_1 = N q c\, E[T_1 + T_2] = \frac{N q c\, (2 - p_c^{on} - p_c^{off})}{(1 - p_c^{on})(1 - p_c^{off})}.
\]
The expected amount of energy spent by the sensors in the system during this Off-On cycle (assuming that the targeted threshold was always met), denoted $E_2$, is given by
\begin{align*}
E_2 &= m\delta_1 + n\delta_1 (E[T_1] - 1) + n(\delta_1 + \delta_2) + m(\delta_1 + \delta_2)(E[T_2] - 1) \\
&= m(\delta_1 + \delta_2) \frac{p_c^{on}}{1 - p_c^{on}} + n\delta_1 \frac{p_c^{off}}{1 - p_c^{off}} + m\delta_1 + n(\delta_1 + \delta_2).
\end{align*}
From [32, 38] we know that for partially rechargeable sensors with sufficiently large energy bucket size $K$, a threshold activation policy $\Pi$ employing an energy-balancing threshold of $m$, such that the average recharge rate equals the average discharge rate in the system, achieves a performance $\bar{U}(\Pi) \to U(m)$ as $K \to \infty$. Intuitively, this happens because the rare events outlined in Section 6.3.2.3 occur with a small probability which approaches zero as $K$ becomes large. Thus, when an energy-balancing threshold of $m$ is applied, sensors operating in steady state achieve a time-average performance of $U(m)$.
In other words, the applied threshold will almost always be met. We only consider energy-balancing TTP and CTP policies hereafter in this chapter. For energy balance during the cycle, equating $E_1 = E_2$, we get
\[
m(\delta_1 + \delta_2) \frac{p_c^{on}}{1 - p_c^{on}} + n\delta_1 \frac{p_c^{off}}{1 - p_c^{off}} + m\delta_1 + n(\delta_1 + \delta_2) = \frac{N q c\, (2 - p_c^{on} - p_c^{off})}{(1 - p_c^{on})(1 - p_c^{off})}. \tag{7.4}
\]
The expected performance achieved during this cycle (assuming that $m, n$ satisfy the energy-balancing constraint (7.4) above) is given by
\[
\bar{U}(\Pi_{m,n}) = \frac{U(n) + U(m)(E[T_2] - 1)}{E[T_2]} = U(n)(1 - p_c^{on}) + U(m)\, p_c^{on}. \tag{7.5}
\]
Since these Off-On cycles are i.i.d. and the expected performance during each of these cycles (under the energy-balancing threshold pair $(m, n)$) is the same, $\bar{U}(\Pi_{m,n})$ above is also the time-average performance of the threshold algorithm. Thus, the optimal (energy-balancing) CTP policy maximizes the performance objective
\[
\max_{m,n} \; U(n)(1 - p_c^{on}) + U(m)\, p_c^{on} \tag{7.6}
\]
subject to the constraint given by (7.4).

Note that for large values of $p_c^{on}$, i.e. $p_c^{on} = 1 - \epsilon$, or for small values of detection probability $p_d$, a threshold policy with $n = 0$ would be optimal. Putting $n = 0$ in (7.4) gives
\[
m = \frac{1}{\pi^{on}} \cdot \frac{N q c}{\delta_1 + p_c^{on} \delta_2}. \tag{7.7}
\]
Let $\Pi^{opt}_{TTP}$ denote the optimal TTP algorithm, and $\Pi^{opt}_{CTP}$ denote the optimal CTP algorithm. Let $U^*$ denote the upper bound on the maximum achievable performance for any activation algorithm, given by Lemma 25. Let $\beta = \frac{\delta_2}{\delta_1}$. Then, we have the following result.

Lemma 27 The optimal TTP threshold and performance are given by
\[
\text{(i)} \quad m^{opt}_{TTP} = \frac{N q c}{\delta_1 + \pi^{on} \delta_2}.
\]
\[
\text{(ii)} \quad \bar{U}\!\left(\Pi^{opt}_{TTP}\right) = U\!\left( \frac{N q c}{\delta_1 + \pi^{on} \delta_2} \right) \ge \frac{\beta + 1}{\beta + \frac{1}{\pi^{on}}}\, U^*.
\]

Proof: Putting $m = n$ in (7.4) gives the energy-balancing TTP threshold of $m = \frac{N q c}{\delta_1 + \pi^{on} \delta_2}$. The performance achieved using this threshold is given by $U\!\left( \frac{N q c}{\delta_1 + \pi^{on} \delta_2} \right)$, which is the optimal TTP performance from Corollary 26. Since $U(\cdot)$ is a strictly concave non-decreasing function, we have $\frac{U(X)}{X} \ge \frac{U(Y)}{Y}$ for $X \le Y$. Substitute $X = \frac{N q c}{\delta_1 + \pi^{on} \delta_2}$ and $Y = \frac{1}{\pi^{on}} \cdot \frac{p_c^{on} N q c}{\delta_1 + p_c^{on} \delta_2}$ to get
\begin{align*}
\bar{U}\!\left(\Pi^{opt}_{TTP}\right) = U(X) &\ge \frac{X}{Y}\, U(Y) = \frac{\pi^{on} (\delta_1 + p_c^{on} \delta_2)}{p_c^{on} (\delta_1 + \pi^{on} \delta_2)}\, U^* \\
&= \frac{\beta + \frac{1}{p_c^{on}}}{\beta + \frac{1}{\pi^{on}}}\, U^* \ge \frac{\beta + 1}{\beta + \frac{1}{\pi^{on}}}\, U^*.
\end{align*}
Note that $X \le Y$ since $p_c^{on} \ge \frac{1}{2} \ge \pi^{on}$. The last inequality follows since $p_c^{on} < 1$. Note that when $p_c^{on} = \pi^{on} = \frac{1}{2}$, $\Pi^{opt}_{TTP}$ achieves optimal performance.

In Chapters 3 and 5, we have shown that the threshold activation policies are robust to spatial correlation in the recharge and/or discharge processes across the sensor nodes. Similar trends are observed here for the case of temporal correlations in the event process. A simple threshold policy (namely the TTP policy with a threshold of $m = \frac{N q c}{\delta_1 + \pi^{on} \delta_2}$) is robust to the amount of temporal correlation present, and achieves a performance $\ge \frac{\beta + 1}{\beta + \frac{1}{\pi^{on}}}\, U^*$ for all values of system parameters. A TTP algorithm is also simpler to use in practice. Since the threshold parameter remains constant over time, it requires minimum state maintenance overhead at the sensing devices. Next, we show that the optimal CTP algorithm achieves near-optimal performance.

Theorem 28 The optimal CTP $\Pi^{opt}_{CTP}$ achieves the following performance bounds:
\[
\text{(i)} \quad \bar{U}\!\left(\Pi^{opt}_{CTP}\right) \ge \bar{U}\!\left(\Pi^{opt}_{TTP}\right).
\]
\[
\text{(ii)} \quad \bar{U}\!\left(\Pi^{opt}_{CTP}\right) \ge \max\left\{ p_c^{on},\; \frac{\beta + 1}{\beta + \frac{1}{\pi^{on}}} \right\} U^*.
\]

Proof: From (7.6), $\bar{U}\!\left(\Pi^{opt}_{CTP}\right) = \max_{m,n} \; U(n)(1 - p_c^{on}) + U(m)\, p_c^{on}$ such that $(m, n)$ satisfy the constraint (7.4). Put $m = n$ in (7.4) to get $m' = n' = \frac{N q c}{\delta_1 + \pi^{on} \delta_2}$. Thus,
\[
\bar{U}\!\left(\Pi^{opt}_{CTP}\right) \ge U(n')(1 - p_c^{on}) + U(m')\, p_c^{on} = U(m') = \bar{U}\!\left(\Pi^{opt}_{TTP}\right).
\]
Put $n = 0$ in (7.4) to get $m' = \frac{1}{\pi^{on}} \cdot \frac{N q c}{\delta_1 + p_c^{on} \delta_2}$. Thus,
\begin{align*}
\bar{U}\!\left(\Pi^{opt}_{CTP}\right) &= \max_{m,n} \; U(n)(1 - p_c^{on}) + U(m)\, p_c^{on} \ge p_c^{on}\, U(m') \\
&= p_c^{on}\, U\!\left( \frac{1}{\pi^{on}} \cdot \frac{N q c}{\delta_1 + p_c^{on} \delta_2} \right) \ge p_c^{on}\, U\!\left( \frac{1}{\pi^{on}} \cdot \frac{p_c^{on} N q c}{\delta_1 + p_c^{on} \delta_2} \right) = p_c^{on}\, U^*.
\end{align*}
The last inequality follows from the non-decreasing nature of the utility function $U(\cdot)$ and since $p_c^{on} < 1$. The result now follows from Part (i) and Lemma 27.
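The bounds in Lemma 27 can be checked numerically for one parameter set. The sketch below is illustrative only: the function names are hypothetical, the utility is assumed to be $U(x) = 1 - (1 - p_d)^x$ evaluated at real-valued $x$, and $\pi^{on}$ is assumed to be the stationary On probability $(1 - p_c^{off}) / (2 - p_c^{on} - p_c^{off})$ of the two-state event chain, which is the value that makes (7.7) agree with (7.4) at $n = 0$. The parameter values are those listed for the simulations in Section 7.2.3.

```python
def stationary_on(p_on, p_off):
    """Assumed stationary probability of the On state for a two-state chain
    with P(On -> On) = p_on and P(Off -> Off) = p_off."""
    return (1.0 - p_off) / (2.0 - p_on - p_off)


def utility(x, p_d):
    """U(x) = 1 - (1 - p_d)^x, extended to real-valued x."""
    return 1.0 - (1.0 - p_d) ** x


def ttp_threshold(N, q, c, d1, d2, pi_on):
    """Energy-balancing TTP threshold of Lemma 27(i)."""
    return N * q * c / (d1 + pi_on * d2)


def upper_bound_argument(N, q, c, d1, d2, p_on, pi_on):
    """Argument of U in the Lemma 25 bound, so that U* = U(this value)."""
    return (p_on / pi_on) * N * q * c / (d1 + p_on * d2)


def lemma27_factor(d1, d2, pi_on):
    """Guarantee factor (beta + 1) / (beta + 1/pi_on), with beta = d2/d1."""
    beta = d2 / d1
    return (beta + 1.0) / (beta + 1.0 / pi_on)


# Parameter values from Section 7.2.3, with p_d = 0.1.
N, q, c, d1, d2, p_on, p_off, p_d = 16, 0.5, 2, 1, 6, 0.9, 0.9, 0.1
pi_on = stationary_on(p_on, p_off)
m_ttp = ttp_threshold(N, q, c, d1, d2, pi_on)
u_ttp = utility(m_ttp, p_d)
u_star = utility(upper_bound_argument(N, q, c, d1, d2, p_on, pi_on), p_d)
factor = lemma27_factor(d1, d2, pi_on)

print("TTP threshold:", m_ttp)
print("U(TTP), guaranteed fraction of U*, U*:", u_ttp, factor * u_star, u_star)
```

For these values the TTP performance lies between the guaranteed fraction of $U^*$ and $U^*$ itself, as Lemma 27(ii) requires; other parameter sets can be substituted in the last block to explore how tight the factor is.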
We observe that as $p_c^{on} \to 1$, the best $(m, n)$ threshold policy achieves the optimal performance. Also, when $p_c^{on} \to \frac{1}{2}$, the optimal TTP performance (and hence the best $(m, n)$ threshold policy performance) approaches the optimal performance. For intermediate values of $p_c^{on}$ such that $\frac{1}{2} < p_c^{on} < 1$, the performance is governed by the factor $\max\left\{ p_c^{on},\; \frac{\beta + 1}{\beta + \frac{1}{\pi^{on}}} \right\}$. Note that an $(m, n)$ threshold is useful when the temporal correlation parameter $p_c^{on}$ is known (or can be estimated) and $p_c^{on} > \frac{\beta + 1}{\beta + \frac{1}{\pi^{on}}}$. Then employing the threshold pair $\left( \frac{1}{\pi^{on}} \cdot \frac{N q c}{\delta_1 + p_c^{on} \delta_2},\; 0 \right)$ leads to a performance $\ge p_c^{on}\, U^*$. For given system parameters, the optimal pair of $(m, n)$ thresholds can be found by solving the optimization problem in (7.6).

7.2.3 Simulation Results

Typically the recharge processes at the sensor nodes would depend upon renewable energy sources. Hence recharge patterns at sensor nodes located in the spatial vicinity of each other are bound to show some degree of spatial correlation. In this chapter, since the $N$ sensors are considered to be co-located in the same region of interest, their recharge processes may be completely correlated with each other. For instance, if one sensor receives a charge amount $c$ in time slot $k$, then all the other sensors would also receive the same amount of charge during this time slot.

(m, n)        (5, 0)      (4, 4)
p_d = 0.5     0.871875    0.9375
p_d = 0.1     0.368559    0.3439

Table 7.1: Performance for various threshold pairs. Both (5, 0) and (4, 4) are energy-balancing threshold pairs. For $p_d = 0.1$, threshold pair (5, 0) achieves better performance, while for $p_d = 0.5$, threshold pair (4, 4) achieves better performance.

We consider both the extreme scenarios in our simulations, one where the recharge processes at the different sensors are independent of each other, and the other where they are completely correlated.
• Independent Recharge: A sensor receives a recharge quantum in time slot $k$ with probability $q$, independent of a recharge quantum being received in time slot $k$ at any other sensor. The probability of recharge $q$, however, is kept constant across the sensor nodes.
• Correlated Recharge: With probability $q$, all the sensors receive a recharge quantum in time slot $k$ simultaneously, and with probability $1 - q$, none of the sensors receives a recharge quantum in time slot $k$.

Figure 7.1 depicts the performance of the various threshold activation policies discussed earlier in the chapter. The system parameters are $N = 16$, $p_c^{on} = 0.9$, $p_c^{off} = 0.9$, $q = 0.5$, $c = 2$, $\delta_1 = 1$, $\delta_2 = 6$, $K = 2400$. The individual probability of detection $p_d$ is varied for correlated and independent recharge among the sensor nodes. TTP refers to the TTP policy employing a threshold of $m$, while CTP refers to a CTP policy employing a threshold of $(m, n)$. CTP (NR) refers to the CTP corresponding to the non-rechargeable sensor case, i.e. when all the sensors are provided with all their energy quanta in advance (see Section 6.3.4). The upper bound (UB) is given by Lemma 25, while the upper bound on TTP policies (UB: TTP) is given by Corollary 26. Simplifying (7.4) for the given system parameters gives $4m + n = 20$.

[Figure 7.1: four panels plotting Time Average Utility against Threshold (m), from 0 to 16, for CTP (NR), CTP, TTP, UB: TTP and UB. (a) Independent recharge ($p_d = 0.1$, $n = 0$). (b) Correlated recharge ($p_d = 0.1$, $n = 0$). (c) Independent recharge ($p_d = 0.5$, $n = 4$). (d) Correlated recharge ($p_d = 0.5$, $n = 4$).]

Figure 7.1: Performance of Threshold Policies.
TTP achieves near-optimal performance. CTP achieves slightly better performance for low detection probability $p_d$.

The only integral solutions $(m, n)$ to $4m + n = 20$ such that $n \le m$ are (5, 0) and (4, 4). Threshold (5, 0) maximizes (7.6) for $p_d = 0.1$, while threshold (4, 4) maximizes the same for $p_d = 0.5$, as shown in Table 7.1. In all the scenarios, the peak performance of TTP is achieved at a threshold of $m = \frac{N q c}{\delta_1 + \pi^{on} \delta_2} = 4$, in accordance with Lemma 27. Figures 7.1(a) and (b) have $p_d = 0.1$, and hence the optimal threshold pair is given by $(m, n) = (5, 0)$. Figures 7.1(c) and (d) have $p_d = 0.5$, and hence the optimal threshold pair is given by (4, 4). Note that a CTP policy with a threshold of (4, 4) achieves the same performance as a TTP policy with a threshold of 4. Thus in the case with large $p_d$, both the TTP and CTP policies achieve near-optimal performance, unlike the case with small $p_d$.

The peak performance of CTP is observed to be slightly less than the upper bound in the simulations. One of the reasons for this behaviour is the edge effects encountered while implementing the threshold policy. While the event process is in the On period, a threshold of $m$ is targeted. However, as noted in Section 7.2.2, the threshold of $m$ can only be employed (put into practice) for $T_2 - 1$ time slots when the On period consists of $T_2$ time slots. A similar effect applies during each Off period with a targeted threshold of $n$. The other reason for this difference between the performance of the best threshold policy (given by the optimal solution to (7.6) with $m, n$ integers) and the upper bound on performance is that the optimal threshold parameters $(m^*, n^*)$ which maximize (7.6) may not be integers. One way to handle fractional threshold parameters is to use randomization while employing the threshold. For instance, if $m = 4.5$, choose $m = 4$ with probability 0.5, and $m = 5$ otherwise.
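The integer threshold pairs and the entries of Table 7.1 can be reproduced directly from (7.4) and (7.5). The following Python sketch uses hypothetical function names, assumes the utility $U(n) = 1 - (1 - p_d)^n$, and adds one possible reading of the randomization idea for fractional thresholds (round up with probability equal to the fractional part); none of this code is taken from the thesis.

```python
import random


def energy_gained(N, q, c, p_on, p_off):
    """E1: expected energy gained over one Off-On cycle."""
    return N * q * c * (1.0 / (1.0 - p_off) + 1.0 / (1.0 - p_on))


def energy_spent(m, n, d1, d2, p_on, p_off):
    """E2: expected energy spent over one cycle (left-hand side of (7.4))."""
    return (m * (d1 + d2) * p_on / (1.0 - p_on)
            + n * d1 * p_off / (1.0 - p_off)
            + m * d1 + n * (d1 + d2))


def balanced_pairs(N, q, c, d1, d2, p_on, p_off, tol=1e-6):
    """Integer pairs (m, n) with 0 <= n <= m <= N that satisfy (7.4)."""
    e1 = energy_gained(N, q, c, p_on, p_off)
    return [(m, n) for m in range(N + 1) for n in range(m + 1)
            if abs(energy_spent(m, n, d1, d2, p_on, p_off) - e1) < tol]


def ctp_performance(m, n, p_d, p_on):
    """Time-average utility (7.5) for an energy-balancing pair (m, n)."""
    def u(k):
        return 1.0 - (1.0 - p_d) ** k
    return u(n) * (1.0 - p_on) + u(m) * p_on


def randomized_threshold(m_frac, rng=random):
    """Round a fractional threshold up with probability equal to its
    fractional part, so the expected threshold equals m_frac."""
    low = int(m_frac)
    return low + (1 if rng.random() < m_frac - low else 0)


# Parameter values from Section 7.2.3.
pairs = balanced_pairs(N=16, q=0.5, c=2, d1=1, d2=6, p_on=0.9, p_off=0.9)
for p_d in (0.5, 0.1):
    for m, n in pairs:
        print(p_d, (m, n), ctp_performance(m, n, p_d, 0.9))
```

With the Section 7.2.3 parameters, the enumeration recovers the two pairs discussed above, and `ctp_performance` matches the four entries of Table 7.1 up to floating-point rounding.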
When the sensor nodes have completely correlated recharge processes, CTP achieves the same performance as CTP (NR) at the threshold $(N, 0)$ (see Figure 7.1(b)). This is because with correlated recharge, all the $N$ sensors operating under the CTP policy become available simultaneously and get activated in a manner such that the number of active sensors in the system at any time is either $0$ or $N$. Further, the total number of time slots in which a threshold of $N$ is met in the system using the CTP policy equals the number of such time slots for CTP (NR). This is because the system operating under the CTP policy receives the same amount of expected energy during the course of its recharge as the amount of energy available in advance to CTP (NR). (The performance of CTP and CTP (NR) at the threshold of $m = N$ is different in Figure 7.1(d), since the performance plotted corresponds to a threshold pair of $(m, 4)$ and not to the pair $(m, 0)$.)

We observe that these threshold based activation schemes are robust not only to the temporal correlations in the event process, but also to the spatial correlations across the recharge processes at the different sensor nodes (as shown earlier in Chapters 3 and 5). The optimal threshold based scheme performs well under both independent and correlated recharge scenarios. Spatial correlation in the recharge, however, worsens system performance for all threshold based policies at higher threshold parameters.

7.3 Summary

For a surveillance region covered using multiple sensors, we consider threshold based activation policies and evaluate their performance for different choices of (constant or time-varying) thresholds under temporally correlated events. We show that a simple, energy-balancing time-invariant threshold policy (TTP) achieves near-optimal performance in the presence of temporal correlations across events.
We also show that a smarter choice of time-varying thresholds (CTP) may provide better performance under certain situations.

CHAPTER 8
Concluding Remarks and Future Work

We have considered a system of rechargeable sensor nodes deployed randomly and redundantly in the region in order to detect the occurrence of interesting phenomena over time. Due to limited energy replenishment at the sensors, energy-efficient operation of these sensors involves making activation and deactivation decisions over time. We have addressed this important decision question, namely the node activation question: How should the individual sensors be activated over time in order to maximize a generalized global system performance metric? We have expressed the performance metric in the form of time-average utility, where the utility function is expressed in terms of the event detection probability or another similar measure. In the case of sufficient redundancy, the utility function is assumed to be concave, i.e., the utility achieved in a region is assumed to exhibit diminishing returns with respect to the number of active sensors in the region. We have considered the random nature of discharge and recharge processes across the sensor nodes, and spatial and temporal correlations in the event occurrence phenomena, in the process of designing efficient node activation policies. We have performed an in-depth study of near-optimal activation policies in a stochastic decision framework, with the goal of designing efficient activation policies which are easy to implement and deploy at sensor nodes in a distributed manner.

We have considered two different models of sensor energy consumption. In the Full Activation sensor system model, a sensor gets recharged only when it has completely discharged, i.e., is no longer operational. Also, the sensor is recharged until it is completely recharged, at which time it is considered available for activation.
In the Partial Activation sensor system model, the recharge process is modeled as a continuous random process which acts upon a sensor at all times, and a sensor is considered available for activation as long as it has sufficient energy to detect events. We have also considered the fact that the event occurrence phenomena, which the sensors are expected to detect, may exhibit a significant degree of correlation in space and/or over time. Since event detection (and subsequent actions like transmission) is the major cause of energy consumption at the sensors, these correlations result in similar correlations across the discharge processes of active sensors. In addition, the recharge processes across the different spatially co-located sensor nodes may exhibit significant correlations.

We have considered the node activation question in the presence of spatial correlations for the full activation sensor system model in Chapters 3 and 4, and for the partial activation sensor system model in Chapter 5. We studied the performance of simple threshold activation policies, and showed analytically that for the case with identical sensor coverage, the energy-balancing threshold policy attains performance: (a) at least $\frac{3}{4}$ of the optimum, under the full activation sensor system model, and (b) at least $O\!\left(\frac{K}{K+1}\right)$ of the optimum, under the partial activation sensor system model. Here, $K$ is the sensor energy bucket size. We showed how the threshold policy can be implemented in a distributed manner in the general network scenario, and showed using simulations that the distributed version of energy-balancing threshold policies achieves near-optimal performance for various system parameters in both the above sensor system models. Our results show that spatial correlations in the discharge and recharge times (or processes) worsen system performance, and that the energy-balancing threshold activation policy is robust to such correlations.
Next, we have considered temporal correlations across event occurrences under the partial activation sensor system model, and addressed the node activation question from the perspective of a single rechargeable sensor node in Chapter 6. Under partial observability, the sensor should deactivate during Off periods, and employ an appropriate sleep duration derived using energy balance. We show that such an activation policy (EB-CW) achieves optimal performance under most practical scenarios. For the case where multiple rechargeable sensors cover the region of interest, we have considered threshold based activation policies and evaluated their performance in Chapter 7. We show that a simple, energy-balancing time-invariant threshold policy (TTP) achieves near-optimal performance in most practical cases. In addition, a smarter choice of time-varying thresholds (CTP) may provide better performance under certain situations.

8.1 Future Directions

In the future, this thesis can be expanded in multiple directions, some of which are discussed below.

Performance Bounds for General Network Scenario: For the partial-coverage overlap case, where the sensors cover different regions in the physical space, we have developed distributed node activation algorithms and evaluated their performance through simulations. It might also be possible to analyze this general network scenario and derive tight bounds on the performance of threshold based, or similar, algorithms. Under Markovian assumptions, the state space of the sensor system in the general network becomes huge and unstructured, which makes it difficult to express the performance in closed form. Considering deterministic recharge and discharge (life)times might make the analysis easier.
As we have seen through simulation results, the performance trends with deterministic times are similar to those with random correlated times, and randomness only results in some additional error factor in the performance. Therefore, considering deterministic recharge and discharge times to analyze the general network scenario might result in policies which are suitable to the random scenario as well. A different approach could include developing activation and sleep schedules for each sensor in a distributed manner. In this approach, all the sensors compute a schedule of their activation and sleep times during the initialization phase. Thereafter, all the sensors follow their respective fixed schedules in a repetitive manner. Note that once the sensors have decided upon their schedules during initialization, no further coordination is necessary between the sensor nodes. Hence, a sensor node no longer needs to wake up periodically to check the status of its network neighbourhood. Designing such schedules during the initialization phase so as to provide optimal time-average performance may turn out to be a computationally prohibitive task. The research challenges would include designing simple schedules which guarantee near-optimal performance.

Connectivity Constraints: Our performance metric only reflects the quality of coverage provided in the region. Another interesting perspective is to include connectivity criteria in the network performance, in addition to coverage. Thus the utility function may be extended to include not only the event detection probability but also the probability that the detected event gets reported successfully to a centralized monitoring station. The latter probability could be expressed in terms of the existence of an active path (consisting of sensors in the active state only) from the sensor to the monitoring station.
Exploring the structure of the optimal activation policy under such formulations using a stochastic decision framework remains an open question.

Time-variant Utility: We have considered utility functions which are independent of time. For the temporal correlation scenarios considered in Chapters 6 and 7, we have considered detection of individual events; for the spatial correlation scenarios considered in Chapters 3, 4, and 5, however, we have not. Therefore, for the spatial correlation scenarios, activating a particular number of sensors during the Off period provides the same utility as activating these sensors during the On period. In practice, however, the performance depends on detecting (and reporting) individual events, and thus only the utility provided during the On periods should be considered in the performance criteria. This and other similar performance constraints can be modeled in the rechargeable sensor system model using a metric based on a time-variant utility function. Such a system model would be expressive enough to model complicated spatio-temporally correlated rechargeable sensor systems.

LITERATURE CITED

[1] M. Adamou, and S. Sarkar, A Framework for Optimal Battery Management for Wireless Nodes, Proc. IEEE INFOCOM, June 2002.
[2] M. Adamou, and S. Sarkar, Computationally Simple Battery Management Techniques for Wireless Nodes, Invited Paper, Proc. European Wireless conference, February 2002.
[3] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, Wireless Sensor Networks: A Survey, Computer Networks 38(4):393-422, March 2002.
[4] I. F. Akyildiz, M. C. Vuran, and O. B. Akan, On Exploiting Spatial and Temporal Correlation in Wireless Sensor Networks, Proc. Second Intl. Symposium on Modeling and Optimization in Mobile Ad Hoc and Wireless Networks (WiOpt), pp. 71-80, March 2004.
[5] N. Bambos, and S. Kandukuri, Power Controlled Multiple Access (PCMA) in Wireless Communication Networks, Proc. IEEE INFOCOM, March 2000.
[6] T. Banerjee, S. Padhy, and A. A. Kherani, Optimal Dynamic Activation Policies in Sensor Networks, Proc. 2nd International Conference on Communication Systems Software and Middleware (COMSWARE), January 2007.
[7] D. P. Bertsekas, Dynamic Programming and Optimal Control, Volume I, Athena Scientific, Belmont, Massachusetts, 2000.
[8] D. P. Bertsekas, Dynamic Programming and Optimal Control, Volume II, Athena Scientific, Belmont, Massachusetts, 2001.
[9] U. N. Bhat, Elements of Applied Stochastic Processes, 2nd Ed., John Wiley, NY, 1984.
[10] D. Blough, and P. Santi, Investigating Upper Bounds on Network Lifetime Extension for Cell-Based Energy Conservation Techniques in Stationary Ad Hoc Networks, Proc. ACM MOBICOM, pp. 183-192, September 2002.
[11] V. S. Borkar, A. A. Kherani, and B. J. Prabhu, Closed and Open Loop Optimal Control of Buffer and Energy of a Wireless Device, Proc. 3rd International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), April 2005.
[12] Joel M. Calabrese, Optimal Workload Allocation in Open Networks of Multiserver Queues, Management Science, 38(12):1792-1802, December 1992.
[13] M. Cardei, and D. Z. Du, Improving Wireless Sensor Network Lifetime through Power Aware Organization, ACM Wireless Networks, 11(3), 2005.
[14] M. Cardei, D. MacCallum, X. Cheng, M. Min, X. Jia, D. Li, and D. Z. Du, Wireless Sensor Networks with Energy Efficient Organization, Journal of Interconnection Networks, 3(3-4):213-229, December 2002.
[15] M. Cardei, M. Thai, Y. Li, and W. Wu, Energy-Efficient Target Coverage in Wireless Sensor Networks, Proc. IEEE INFOCOM, March 2005.
[16] A. R. Cassandra, L. P. Kaelbling, and M. L. Littman, Acting Optimally in Partially Observable Stochastic Domains, Proc. 12th National Conference on Artificial Intelligence (AAAI-94), 2:1023-1028, 1994.
[17] J. Chang, and L. Tassiulas, Energy Conserving Routing in Wireless Ad-hoc Networks, Proc. IEEE INFOCOM, March 2000.
[18] J. H. Chang, and L. Tassiulas, Fast Approximation Algorithms for Maximum Lifetime Routing in Wireless Ad-hoc Networks, Proc. Networking 2000, May 2000.
[19] B. Chen, K. Jamieson, H. Balakrishnan, and R. Morris, Span: An Energy Efficient Coordination Algorithm for Topology Maintenance in Ad Hoc Wireless Networks, Proc. ACM MOBICOM, pp. 85-96, July 2001.
[20] C. Chiasserini, and R. Rao, Energy Efficient Battery Management, Proc. IEEE INFOCOM, March 2000.
[21] I. Chlamtac, C. Petrioli, and J. Redi, Energy-conserving Access Protocols for Identification Networks, IEEE/ACM Transactions on Networking, 7(1):51-59, February 1999.
[22] A. Chockalingam, and M. Zorzi, Energy Efficiency of Media Access Protocols for Mobile Data Networks, IEEE Transactions on Communications, 46(11):1418-1421, November 1998.
[23] Y. Dallery, and K. E. Stecke, On the Optimal Allocation of Servers and Workloads in Closed Queuing Networks, Operations Research, 38(4):694-703, 1990.
[24] K. Dasgupta, M. Kukreja, and K. Kalpakis, Topology-Aware Placement and Role Assignment for Energy-Efficient Information Gathering in Sensor Networks, Proc. Eighth IEEE International Symposium on Computers and Communication, 1:341-348, 2003.
[25] D. Ganesan, R. Govindan, S. Shenker, and D. Estrin, Highly Resilient, Energy-Efficient Multipath Routing in Wireless Sensor Networks, Mobile Computing and Communications Review (MC2R), 1(2), 2002.
[26] M. Gastpar, and M. Vetterli, Source-channel Communication in Sensor Networks, Proc. 2nd Intl. Workshop on Information Processing in Sensor Networks (IPSN), pp. 162-177, April 2003.
[27] E. F-Gaucherand, A. Arapostathis, and S. I. Marcus, On the Average Cost Optimality Equation and the Structure of Optimal Policies for Partially Observable Markov Decision Processes, Annals of Operations Research, 29(1-4):439-470, April 1991.
[28] D. Gross, and C.M. Harris, Fundamentals of Queuing Theory, Wiley Series in Probability and Statistics, NY, 1998.
[29] H. Gupta, S. Das, and Q. Gu, Connected Sensor Cover: Self-Organization of Sensor Networks for Efficient Query Execution, Proc. ACM MOBIHOC, 2003. [30] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan, Energy-efficient Routing Protocols for Wireless Microsensor Networks, Proc. HICSS, January 2000. [31] W. Heinzelman, J. Kulik, and H. Balakrishnan, Adaptive Protocols for Information Dissemination in Wireless Sensor Networks, Proc. ACM MOBICOM, August 1999. [32] N. Jaggi, Robust Threshold based Sensor Activation Policies under Spatial Correlation, Proc. 4th Intl. Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), pp. 1-8, April 2006. [33] N. Jaggi, and A. A. Abouzeid, Energy-Efficient Connected Coverage in Wireless Sensor Networks, Proc. 4th Asian International Mobile Computing Conference (AMOC), pp. 77-86, January 2006. [34] N. Jaggi, K. Kar, and A. Krishnamurthy, Rechargeable Sensor Activation under Temporally Correlated Events, to appear in Proc. 5th Intl. Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), April 2007. [35] N. Jaggi, K. Kar, and A. Krishnamurthy, Dynamic Sleep Scheduling in Rechargeable Sensor Networks, Book Chapter, to appear in Handbook of Wireless Mesh and Sensor Networks, 2007. [36] N. Jaggi, K. Kar, and A. Krishnamurthy, Near-Optimal Activation Policies in Rechargeable Sensor Networks under Spatial Correlations, ACM Transactions on Sensor Networks (under Review). [37] N. Jaggi, K. Kar, and A. Krishnamurthy, Rechargeable Sensor Activation under Temporally Correlated Events, IEEE Transactions on Automatic Control (submitted for publication). 143 [38] N. Jaggi, A. Krishnamurthy, and K. Kar, Utility Maximizing Node Activation Policies in Networks of Partially Rechargeable Sensors, Proc. 39th Annual Conference on Information Sciences and Systems (CISS), March 2005. [39] E. Jung, and N. Vaidya, An Energy Efficient MAC Protocol for Wireless LANs, Proc. 
IEEE INFOCOM, June 2002. [40] A. Kansal, and M. B. Srivastava, Energy Harvesting Aware Power Management, Book Chapter, Wireless Sensor Networks: A Systems Perspective, Artech House, Norwood, MA, July 2005. [41] K. Kar, and S. Banerjee, Node Placement for Connected Coverage in Sensor Networks, Proc. Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt), 2003. [42] K. Kar, M. Kodialam, T. V. Lakshman, and L. Tassiulas, Online Routing in Energy-constrained Ad-hoc Networks, Proc. IEEE INFOCOM, April 2003. [43] K. Kar, A. Krishnamurthy, and N. Jaggi, Dynamic Node Activation in Networks of Rechargeable Sensors, Proc. IEEE INFOCOM, pp. 1997-2007, March 2005. [44] K. Kar, A. Krishnamurthy, and N. Jaggi, Dynamic Node Activation in Networks of Rechargeable Sensors, IEEE/ACM Transactions on Networking, 14(1):15-26, February 2006. [45] K. Kar, and A. Krishnamurthy, Threshold Activation Policies in a Random Sensing Environment, Technical Report, Rensselaer Polytechnic Institute, July 2004. (www.ecse.rpi.edu/∼ koushik/sensor-threshold-techrep.pdf) [46] R. Kravets, and P. Krishnan, Power management techniques for mobile communications, Proc. ACM MOBICOM, pp. 157-168, October 1998. [47] B. Krishnamachari, Y. Mourtada, and S. Wicker, The Energy-Robustness Tradeoff for Routing in Wireless Sensor Networks, Proc. IEEE International Conference on Communications (ICC), May 2003. [48] S. S. Lavenberg, The Steady State Queuing Time Distribution for the M/G/1 Finite Capacity Queue, Management Science, 21(5):501-506, 1975. [49] Q. Li, J. Aslam, and D. Rus, Online Power-aware Routing in Wireless Ad-hoc Networks, Proc. ACM MOBICOM, July 2001. [50] L. Lin, N. Shroff, and R. Srikant, Asymptotically Optimal Power-Aware Routing for Multihop Wireless Networks with Renewable Energy Sources, Proc. IEEE INFOCOM, March 2005. 144 [51] M. L. Littman, Memoryless policies: Theoretical limitations and practical results, From Animals to Animats 3: Proc. 3rd Intl. 
Conference on Simulation of Adaptive Behavior, pp. 238-245, 1994. [52] David R. Manfield, and P. Tran-Gia, Analysis of a Finite Storage System with Batch Input Arising out of Message Packetization, IEEE Transactions on Communications, 30(3):456-463, March 1982. [53] S. Meguerdichian, F. Koushanfar, M. Potkonjak, and M. Srivastava, Coverage Problems in Wireless Ad-Hoc Sensor Network, Proc. IEEE INFOCOM, April 2001. [54] M. Papadopouli, and H. Schulzrinne, Effects of power conservation, wireless coverage and cooperation on data dissemination among mobile devices, Proc. ACM MOBIHOC, pp. 117-127, October 2001. [55] A. Papoulis, and S. U. Pillai, Probability, Random Variables and Stochastic Processes, McGraw Hill, 2002. [56] S. Pattem, B. Krishnamachari, and R. Govindan, The Impact of Spatial Correlation on Routing with Compression in Wireless Sensor Networks, Proc. 3rd Intl. Symposium on Information Processing in Sensor Networks (IPSN), pp. 28-35, April 2004. [57] Martin L. Puterman, Markov Decision Processes - Discrete Stochastic Dynamic Programming, John Wiley and Sons, NY, 1994. [58] B. Prabhakar, E. Uysal, and A. El Gamal, Energy-efficient Transmission over a Wireless Link via Lazy Packet Scheduling, Proc. IEEE INFOCOM, April 2001. [59] C. S. Raghavendra, and S. Singh, PAMAS (Power Aware Multi-Access Protocol) with Signaling for Ad hoc Networks, Computer Communications Review, July 1998. [60] V. Raghunathan, A. Kansal, J. Hsu, J. Friedman, and Mani Srivastava, Design Considerations for Solar Energy Harvesting Wireless Embedded Systems, Proc. 4th Intl. Conference on Information Processing in Sensor Networks (IPSN) - Special Track on Platform Tools and Design Methods for Network Embedded Sensors (SPOTS), pp. 457-462, April 2005. [61] V. Rajendran, J. J. Garcia-Luna-Aceves, and K. Obraczka, An Energy-Efficient Channel Access Scheduling for Sensor Networks, Proc. Fifth International Symposium on Wireless Personal Multimedia Communication (WPMC), October 2002. [62] Y. 
Sagduyu, and A. Ephremides, Energy-efficient Collision Resolution in Wireless Ad-hoc Networks, Proc. IEEE INFOCOM, April 2003. 145 [63] S. Sahni, and X. Xu, Algorithms for Wireless Sensor Networks, Intl. Journal on Distributed Sensor Networks, Invited Paper, Preview Issue, 2004, 35-56. [64] M. Shaked, and J. G. Shanthikumar, Stochastic Orders and their Applications, Academic Press, NY, 1994. [65] S. Shakkottai, R. Srikant, and N. Shroff, Unreliable Sensor Grids: Coverage, Connectivity and Diameter, Proc. IEEE INFOCOM, April 2003. [66] J. G. Shanthikumar, and D. D. Yao, On Server Allocation in Multiple Center Manufacturing Systems, Operations Research, 36(2):333-342, 1988. [67] T. Shepherd, A Channel Access Scheme for Large Dense Packet Radio Networks, Proc. ACM SIGCOMM, 1996. [68] S. Singh, and C. S. Raghavendra, PAMAS: Power Aware Multi-Access protocol with Signalling for Ad Hoc Networks, ACM Computer Communication Review, pp. 5-26, July 1998. [69] S. Singh, M. Woo, and C. S. Raghavendra, Power Aware Routing in Mobile Ad-hoc Networks, Proc. ACM MOBICOM, pp. 181-190, October 1998. [70] S. Slijepcevic, and M. Potkonjak, Power Efficient Organization of Wireless Sensor Networks, IEEE International Conference on Communications (ICC), 2:472-476, June 2001. [71] D. R. Smith, and W. Whitt, Resource Sharing for Efficiency in Traffic Systems, The Bell System Technical Journal, 60(1), January 1981. [72] V. Srinivasan, P. Nagehalli, C. Chiasserini, and R. Rao, Energy Efficiency of Ad hoc Wireless Networks with Selfish Users, Proc. European Wireless conference, February 2002. [73] G. Theocharous, S. Mannor, N. Shah, P. Gandhi, B. Kveton, S. Siddiqi, and C-H. Yu, Machine Learning for Adaptive Power Management, Intel Technology Journal, Vol. 10, Issue 4, 2006. [74] M. C. Vuran, O. B. Akan, and I. F. Akyildiz, Spatio-Temporal Correlation: Theory and Applications for Wireless Sensor Networks, Elsevier Computer Networks, 45(3):245-261, June 2004. [75] X. Wang, G. Xing, Y. Zhang, C. 
Lu, R. Pless, and C. Gill, Integrated Coverage and Connectivity Configuration in Wireless Sensor Networks, Proc. First ACM Conference on Embedded Networked Sensor Systems (SenSys), November 2003. [76] R. L. Wheeden, and A. Zygmund, Measure and Integral, An Introduction to Real Analysis, Marcel Dekker Inc., NY, 1977. 146 [77] Y. Xu, S. Bien, Y. Mori, J. Heidemann, and D. Estrin, Topology Control Protocols to Conserve Energy in Wireless Ad Hoc Networks, CENS Technical Report 0006, January 2003 (http://lecs.cs.ucla.edu/Publications/papers/gaf-cec-journal.pdf). [78] Y. Xu, J. Heidemann, and D. Estrin, Adaptive Energy-Conserving Routing for Multihop Ad Hoc Networks, Research Report 527, USC/Information Sciences Institute, October 2000. [79] Y. Xu, J. Heidemann, and D. Estrin, Geography-informed Energy Conservation for Ad Hoc Routing, Proc. ACM MOBICOM, pp. 70-84, July 2001. [80] W. Ye, J. Heidemann, and D. Estrin, An Energy-Efficient MAC Protocol for Wireless Sensor Networks, Proc. IEEE INFOCOM, June 2002. [81] W. Ye, J. Heidemann, and D. Estrin, Medium Access Control with Coordinated, Adaptive Sleeping for Wireless Sensor Networks, IEEE/ACM Transactions on Networking, 12(3):493-506, June 2004. [82] Y. Yu, B. Krishnamachari, and V. K. Prasanna, Energy-Latency Tradeoffs for Data Gathering in Wireless Sensor Networks, Proc. IEEE INFOCOM, March 2004. [83] H. Zhang, and J. C. Hou, Maintaining Sensing Coverage and Connectivity in Large Sensor Networks, NSF International Workshop on Theoretical and Algorithmic Aspects of Sensor, Ad Hoc Wireless and Peer-to-Peer Networks, February 2004. [84] J. Zhao, R. Govindan, and D. Estrin, Residual Energy Scans for Monitoring Wireless Sensor Networks, Proc. IEEE Wireless Communications and Networking Conference (WCNC), March 2002. [85] Z. Zhou, S. Das, and H. Gupta, Connected K-Coverage Problem in Sensor Networks, Proc. International Conference on Computer Communications and Networks (ICCCN), 2004.