1. The Impact of Data Aggregation in Wireless Sensor Networks
2. The ACQUIRE Mechanism for Efficient Querying in Sensor Networks
By: Kinnary Jangla, Rishi Kant Sharda

Paper By:
- Bhaskar Krishnamachari
- Deborah Estrin
- Stephen Wicker
Presented By:
- Kinnary Jangla
- Rishi Kant Sharda

Basic Idea
- Exploit data redundancy: packets from different nodes are combined in-network.
- Implementation: who carries the data with redundancy?
- Data-centric routing

Differences
- Data-centric routing: routing based on the contents of the packets.
- Address-centric routing: routing in an end-to-end manner.

The Impact Of Data Aggregation On Wireless Sensor Networks

Overview
- Sensor network models: Event-Radius Model, Random Sources Model
- Impact of source-destination placement and communication network density on:
  - Energy costs
  - Delay
- Data-centric routing: significant performance gain
- Complexity of data aggregation: NP-hard problem

Sub-Titles
- Introduction
- Routing Models: AC, DC
- Data Aggregation: optimal and suboptimal aggregation
- Sensor Network Models
- Energy Savings: theoretical results, simulation results
- Delay

Introduction
- Concepts: Sensor network? Sensor node? Unattended operation? Data aggregation? Data redundancy!
- Wireless sensor network applications
- Network topology of a wireless sensor network

Data Aggregation in WSN
- Address-centric approach
- Data-centric approach

Routing Models
- Address-Centric (AC) Approach
- Data-Centric (DC) Approach

Data Aggregation
Result 1:
- The optimum number of transmissions required per datum for the DC protocol is equal to the number of edges in the minimum Steiner tree in the network that contains the node set (S1, ..., Sk, D).
- Hence, assuming an arbitrary placement of sources and a general network graph G, the task of doing DC routing with optimal data aggregation is NP-hard.
- Background: Steiner tree? NP-hard problem?

Optimal Data Aggregation
- The optimal data aggregation problem is NP-hard.
- It is an optimal multicast problem, which is the well-known minimum Steiner tree problem (NP-complete).
- So no efficient optimal solution is known; suboptimal solutions are used instead.

Data Aggregation, Section 1: three suboptimal schemes
- Center at Nearest Source (CNS): the aggregation center is the source nearest to the sink.
- Shortest Paths Tree (SPT): shortest path routing, with data aggregation at the nodes where paths overlap.
- Greedy Incremental Tree (GIT): the source closest to the current tree is connected to it, forming a new tree, until all the source nodes are vertices of the tree.

Section 2: Sensor Network Models (for source placement)
1. Factors affecting the performance gains of a sensor network: the positions of the sources and the communication network topology.
2. Models: Event-Radius (ER) Model, Random Sources (RS) Model.

Event-Radius Model
- Location of an event.
- Sensing range, S: nodes within distance S of the event are sources.
- Average number of sources: π·S²·n, with n nodes per unit area.

Random Sources Model
- Sources are not clustered.
- k random nodes that are not sinks are chosen to be sources.

Energy Savings Due to Data Aggregation
Notation:
- di: the length of the shortest path from source i to the sink
- NA: the total number of transmissions required for the optimal address-centric protocol
- ND: the total number of transmissions required for the optimal data-centric protocol
- X: the diameter of the set of source nodes
- k: the number of sources in the RS model
- R: communication range
- S: sensing range in the ER model

Energy Savings Due to Data Aggregation
- Main performance gain: when the sources are far away from the sink.
- NA = d1 + d2 + ... + dk = sum(di)
- Diameter X = max of the pairwise shortest paths between sources.

Theoretical Results
Result 2: If the source nodes S1, S2, ..., Sk have a diameter X >= 1, the total number of transmissions ND required for the optimal DC protocol satisfies the following bounds:
- ND <= (k-1)X + min(di)   (upper bound, X >= 1)
- ND >= min(di) + (k-1)    (lower bound, case X = 1)
Corollary: If the diameter X < min(di), then ND < NA.
Proof:
- Upper bound: consider the data aggregation tree in which (k-1) sources send their packets to the remaining source, the one nearest to the sink. This tree has no more than (k-1)X + min(di) edges.
- Lower bound: consider the smallest possible Steiner tree, which would occur if the diameter were 1. The shortest path from the source node at min(di) must be part of the minimum Steiner tree, and there is exactly one edge from each of the other source nodes to this node.
Conclusion: when X < min(di), the optimum data-centric protocol performs strictly better than the address-centric protocol.

Result 3: For fixed k and X, ND/NA tends to 1/k as min(di) grows, i.e. the DC protocol gives k-fold savings when the sources are far from the sink.

Result 4: If the subgraph G" of the communication graph G induced by the set of source nodes (S1, ..., Sk) is connected, the optimal data aggregation tree can be formed in polynomial time.
Corollary: In the ER model, when R > 2S, the optimal data aggregation tree can be formed in polynomial time. (Any two sources lie within 2S of each other, hence within communication range, so the source-induced subgraph is connected.)
Proof:
- The tree is initialized with the path from the sink to the nearest source.
- At each additional step of the GIT, the next source to be connected to the tree is always exactly one step away (such a source is guaranteed to exist since G" is connected).
- At the end of the construction, the number of edges in the tree is therefore dmin + (k-1), where dmin = min(di). This equals the lower bound of Result 2, so the tree is optimal.
- Therefore, the GIT construction runs in polynomial time w.r.t. the number of nodes.

Summary
- Result 1: The number of transmissions for the optimal DC protocol = number of edges in the minimum Steiner tree.
- Result 2: (k-1) + min(di) <= ND <= (k-1)X + min(di)
- Result 3: ND/NA tends to 1/k when the sources are far from the sink.
- Result 4: The optimal data aggregation tree can be formed in polynomial time when the source-induced subgraph is connected.

Simulation Results
- Figure 1: Comparison of energy costs versus R in the ER model.
- Figure 2: Comparison of energy costs versus R in the RS model.
- Figure 3: Comparison of energy costs versus S (sensing range) in the ER model.
- Figure 4: Comparison of energy costs versus k in the RS model.

Energy Savings
Summary of experiments: energy savings due to data aggregation can be quite significant, particularly when there are many sources (large S or large k) that are many hops from the sink (small R).

Delay Due to Data Aggregation
- Tradeoff: greater delay. Data from some sources has to be held back at an intermediate node in order to be aggregated.
- Worst case: the latency due to aggregation is proportional to the number of hops between the sink and the farthest source.
- Figure 5: Max(di) and Min(di) versus R in the ER model.
- Figure 6: Max(di) and Min(di) versus S in the ER model.
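The greedy incremental tree construction and the bounds of Result 2 can be tried out on a small example. The following is a minimal sketch, not the authors' simulator: it assumes a 4-connected grid as the communication graph, and the function names (`hops_to_tree`, `address_centric_cost`, `git_cost`) and the node positions are hypothetical choices made for illustration.

```python
from collections import deque

def neighbors(node, width, height):
    """4-connected grid neighbors of node, an (x, y) tuple (assumed topology)."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)

def hops_to_tree(start, tree, width, height):
    """BFS from start; return the path [tree_node, ..., start] to the nearest tree node."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in tree:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path
        for nb in neighbors(node, width, height):
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    raise ValueError("start is not connected to the tree")

def address_centric_cost(sink, sources, width, height):
    """NA: every source uses its own shortest path, so NA = sum(di)."""
    return sum(len(hops_to_tree(s, {sink}, width, height)) - 1 for s in sources)

def git_cost(sink, sources, width, height):
    """Number of edges in the greedy incremental tree (one transmission per edge)."""
    tree, remaining, edges = {sink}, set(sources), 0
    while remaining:
        # Connect the source that is currently closest to the tree.
        paths = [hops_to_tree(s, tree, width, height) for s in remaining]
        best = min(paths, key=len)
        edges += len(best) - 1
        tree.update(best)
        remaining.discard(best[-1])
    return edges

# Hypothetical placement: three adjacent sources far from the sink.
sink = (0, 0)
sources = [(7, 7), (7, 8), (8, 7)]
na = address_centric_cost(sink, sources, 10, 10)
nd_git = git_cost(sink, sources, 10, 10)
print("NA =", na, "GIT edges =", nd_git)
```

In this placement k = 3, min(di) = 14 and X = 2, so Result 2 gives 16 <= ND <= 18. The source-induced subgraph is connected, so the GIT reaches the lower bound of 16 edges, against NA = 44 for the address-centric protocol.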
Conclusions
- The formation of an optimal data aggregation tree is NP-hard.
- Energy gains are possible with data aggregation. They are large when:
  - the number of sources is large
  - the sources are located close to each other and far from the sink
- Aggregation latency (delay) is non-negligible.

The ACQUIRE Mechanism for Efficient Querying in Sensor Networks
Written By:
- Narayanan Sadagopan
- Bhaskar Krishnamachari
- Ahmed Helmy
Presented By:
- Rishi Kant Sharda
- Kinnary Jangla

The Basics
A sensor network is a computer network of many spatially distributed devices that use sensors to monitor conditions at different locations, such as temperature, sound, vibration, pressure, motion or pollutants. Each device is equipped with a radio transceiver, a small microcontroller, and an energy source, usually a battery. The devices use each other to transport data to a monitoring computer. Usually these devices are small and inexpensive, so that they can be produced and deployed in large numbers, and so their resources in terms of energy, memory, computational speed and bandwidth are severely constrained. It is therefore not feasible to collect all measurements from each device for centralized processing.

Introduction
- It is best to view sensor networks as distributed databases.
- A central querier/data sink issues queries.
- Due to energy constraints, it is desirable for much of the data processing to be done in-network.
- This leads to the concept of data-centric information routing, i.e. queries and responses are for named data.

Categories of Queries
- Continuous queries, e.g. "Report the measured temperature for the next 7 days with a frequency of 1 measurement per hour."
- One-shot queries, e.g. "Is the current temperature higher than 70°?"
- Aggregate queries, e.g. "Report the calculated average temperature of all nodes in region X."
- Non-aggregate queries, e.g. "What is the temperature measured by node x?"
- Complex queries, e.g. "What are the values of the following variables: X, Y, Z?"
- Simple queries, e.g. "What is the value of the variable X?"
- Queries for replicated data, e.g. "Has a target been observed anywhere in the area?"
- Queries for unique data

Flooding-based query mechanisms (Directed Diffusion data-centric routing scheme)
Expanding Ring Search

Why ACQUIRE?
- Earlier flooding-based query methods, such as the Directed Diffusion data-centric routing scheme, are well suited only to continuous, aggregate queries.
- A one-size-fits-all approach is unlikely to provide efficient solutions for other query types.
- If the query is not continuous, flooding can dominate the costs associated with querying.
- Similarly, in data aggregation, duplicate responses can lead to suboptimal data collection in terms of energy costs.

Example: Bird Habitat Monitoring
- Task: "Obtain sample calls for the following birds in the reserve: Blue jay, Nightingale, Cardinal, Warbler"
- This query is complex, one-shot, and for replicated data.

ACQUIRE
[Figure legend: active query, complete response, update messages, sensor]

Analysis of ACQUIRE
- Basic Model and Notation (local update, forward)
- Steps to Query Completion
- Local Update Cost
- Total Energy Cost
- Optimal Look-ahead

Basic Model and Notation
- X: number of sensors.
- V = {V1, V2, ..., VN}: the N variables tracked.
- Q = {Q1, Q2, ..., QM}: a query consisting of M sub-queries, 1 < M ≤ N, with Qi ∈ V for all i ≤ M.
- SM: the average number of steps taken to resolve a query consisting of M sub-queries.
- d: look-ahead parameter.
- f(d): size of a sensor's neighborhood, i.e. the number of sensors within d hops.
- It is assumed that all queries Q are resolvable by this network.
- x*: the querier, which issues the query Q.

ACQUIRE Process
- Local Update: if its current information is not up to date, the active node x sends a request to all sensors within d hops. The request is forwarded hop by hop. Sensors that receive the request then forward their information to x. Let the energy consumed in this phase be Eupdate.
- Forward: after answering as much of the query as it can from the information received, x forwards the remaining query to a randomly chosen node d hops away.
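The local-update and forward steps above can be sketched as a small simulation. This is an illustrative sketch under stated assumptions, not the authors' model: sensors sit on a wrap-around (torus) grid so that boundary effects vanish, each sensor tracks one variable chosen uniformly at random, every step performs a full local update, and d ≥ 1. The names `neighborhood` and `acquire_steps` are hypothetical.

```python
import random

def neighborhood(center, d, side):
    """All nodes within d hops of center on a side x side torus grid."""
    cx, cy = center
    return {((cx + dx) % side, (cy + dy) % side)
            for dx in range(-d, d + 1)
            for dy in range(-(d - abs(dx)), d - abs(dx) + 1)}

def acquire_steps(side, num_vars, num_subqueries, d, seed=0, max_steps=10_000):
    """Count active-node steps until every sub-query is resolved (requires d >= 1)."""
    rng = random.Random(seed)
    # Assumption: each sensor tracks one of the N variables, chosen uniformly.
    tracked = {(x, y): rng.randrange(num_vars)
               for x in range(side) for y in range(side)}
    # Only ask for variables that exist, so the query is resolvable.
    present = sorted(set(tracked.values()))
    remaining = set(rng.sample(present, min(num_subqueries, len(present))))
    active = (rng.randrange(side), rng.randrange(side))  # the querier x*
    steps = 0
    while remaining and steps < max_steps:
        steps += 1
        # Local update: learn the variables tracked within d hops.
        known = {tracked[node] for node in neighborhood(active, d, side)}
        remaining -= known
        if remaining:
            # Forward: hand the rest of the query to a random node d hops away.
            dx = rng.randint(-d, d)
            dy = (d - abs(dx)) * rng.choice((-1, 1))
            active = ((active[0] + dx) % side, (active[1] + dy) % side)
    return steps, remaining

steps, unresolved = acquire_steps(side=15, num_vars=20, num_subqueries=5, d=2)
print("steps:", steps, "unresolved sub-queries:", len(unresolved))
```

Raising d makes each neighborhood larger, so fewer steps are needed, at the price of a more expensive local update in every step.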
ACQUIRE Process 2
- Since updates are triggered only when the information is not fresh, it makes sense to quantify how often such updates will be triggered.
- We model this with an amortization factor c, with 0 < c ≤ 1: an update is likely to occur at any given node only once every 1/c queries.
- E.g. if on average an update has to be done once every 100 queries, c = 0.01.
- α denotes the expected number of hops from the node where the query is completely resolved to x*.

ACQUIRE Process 3
- The average energy consumed to answer a query of size M with look-ahead d can be expressed as:
  Eavg = (c·Eupdate + d)·SM + α
- Case d = D, where D is the diameter of the network: the query is resolved in one step, but the update covers the whole network, as in flooding.
- Case d too small: many steps are needed to resolve the query.
- Tradeoff: SM decreases as d increases; Eupdate increases as d increases.

Steps to Query Completion
- If there are M queries to be resolved, the probability of success in each trial is p = M/N and the probability of failure is 1 - p = (N-M)/N.
- Expected number of trials until the first success: 1/p = N/M.
- The whole experiment can be repeated with one fewer query; the expected time to answer another query is N/(M-1), and so on.
- Let σM be the number of trials until M successes, i.e. complete resolution. Then:
  E(σM) = N/M + N/(M-1) + ... + N/1 = N·H(M)

Steps to Query Completion 2
- H(M) is the sum of the first M terms of the harmonic series.
- H(M) ≈ ln(M) + γ, where γ = 0.57721 is Euler's constant, thus:
  E(σM) ≈ N(ln(M) + γ)
- Each step covers the f(d) sensors of one neighborhood, so:
  SM ≈ E(σM)/f(d) = N(ln(M) + γ)/f(d)

Local Update Cost
- Eupdate: energy spent in updating the information at each active node.
- The number of transmissions needed to forward the request is the number of nodes within d-1 hops, f(d-1).
- N(i): number of nodes at hop i. Each of them needs i transmissions to return its information, so:
  Eupdate = f(d-1) + sum over i = 1..d of i·N(i)

Total Energy Cost
- If the response is returned along the reverse path, α ≤ d·SM, so:
  Eavg ≤ (c·Eupdate + 2d)·SM
- Special case d = 0 (random walk): E(σM) steps to resolve and return the query.

Optimal Look-ahead
- Ignoring boundary effects, it can be shown that N(i) = 4i and f(d) = 2d(d+1) + 1 for a grid of sensors in which each node has 4 immediate neighbors.
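Two of the quantities above are easy to check numerically: the expected number of trials, built from the terms N/M, N/(M-1), ..., N/1, against its logarithmic approximation, and the grid neighborhood size f(d) against the sum of N(i) = 4i. The sketch below is only a sanity check; the function names are hypothetical, and N = 1000, M = 200 are used as example values.

```python
import math

EULER_GAMMA = 0.57721  # value of Euler's constant quoted on the slide

def expected_trials(n_vars, m_subqueries):
    """Sum of N/M + N/(M-1) + ... + N/1, i.e. N times the harmonic number H(M)."""
    return sum(n_vars / j for j in range(1, m_subqueries + 1))

def approx_trials(n_vars, m_subqueries):
    """Logarithmic approximation N * (ln(M) + gamma)."""
    return n_vars * (math.log(m_subqueries) + EULER_GAMMA)

def grid_neighborhood_size(d):
    """f(d): the node itself plus N(i) = 4i nodes at each hop i = 1..d."""
    return 1 + sum(4 * i for i in range(1, d + 1))

exact = expected_trials(1000, 200)
approx = approx_trials(1000, 200)
print("exact:", round(exact, 1), "approx:", round(approx, 1))
print("f(d) for d = 0..4:", [grid_neighborhood_size(d) for d in range(5)])
```

For these values the approximation differs from the exact sum by well under one percent, and the hop-by-hop count agrees with the closed form 2d(d+1) + 1.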
- Combining the expressions for SM, Eupdate, Eavg, N(i) and f(d) gives the average energy cost as a function of the look-ahead d.

Optimal Look-ahead 2
- The value of the look-ahead parameter that minimizes this energy cost, d*, is found by taking the derivative with respect to d and setting it equal to 0.
- In general, the lower c is, the higher the optimal look-ahead parameter d* will be.

[Figure: Average Energy per Query versus Look-ahead Parameter (d), N = 1000, M = 200, one curve for each c from 0.01 to 0.07]

Comparison

Conclusions
- Proposed ACQUIRE as a scalable protocol for complex, one-shot queries for replicated data in sensor networks.
- Developed an analytical comparison of ACQUIRE, flooding-based querying (FBQ) and expanding ring search (ERS).
- With optimal parameter settings, ACQUIRE outperforms the other schemes for complex, one-shot queries.
- Optimal ACQUIRE performs many orders of magnitude better than flooding-based schemes.
- It can reduce energy consumption by more than 60% compared with expanding ring search.

Future Work
- The efficiency of ACQUIRE can be improved further if the neighborhoods of successive active nodes in the query trajectory have minimal overlap.
- Guided trajectories may also help in dealing with non-uniform data distributions.
- The analysis should take into account that receptions also influence energy consumption, especially for broadcast messages.

THANK YOU