Data Dissemination and Fusion in Sensor Networks

The Need for Data Dissemination and Fusion
– Energy efficiency is essential; therefore, short-range hop-by-hop communication is preferred over direct long-range communication to the destination
– Since a sensor network holds a large amount of data for the end user, methods for combining or aggregating data into a small set of information are necessary and contribute to energy savings
– Data aggregation (also known as data fusion) can combine unreliable data readings into a more accurate signal by reinforcing the common signal and reducing the noise

Taxonomy of Data Delivery Models in Wireless Sensor Networks
– Wireless sensor networks are classified according to their data delivery model into the following categories [Kulik+ 2002]:
1. Continuous
   o LEACH [Heinzelman+ 2000, 2002] is designed for routing data to base stations in static wireless sensor networks
   o TEEN (Threshold sensitive Energy Efficient sensor Network Protocol) [Agrawal+ 2001] and PEGASIS (Power Efficient GAthering in Sensor Information Systems) [Lindsey+ 2001] are both proposed as improvements to LEACH
2. Observer-initiated
   o In Directed Diffusion [Intanagonwiwat+ 2000], data are named using attribute-value pairs, and sensed information in the network can be associated with such a pair. The sensor nodes send queries expressing their interest in sensed information that satisfies specific criteria
3. Event-driven
   o SPIN (Sensor Protocols for Information via Negotiation) [Kulik+ 2002] is a set of protocols designed to disseminate data to all nodes in the network
4. Hybrid
   o The above three approaches can coexist in the same network

Directed Diffusion [Intanagonwiwat+ 2000]
– Motivated by scaling, robustness, and energy-efficiency requirements
– Directed diffusion is data-centric in that all communication is for named data
– Data generated by sensor nodes is named using attribute-value pairs
– All nodes in the network are application-aware
– A node requests data by sending interests for named data
– A sensing task is disseminated through a sequence of local interactions throughout the sensor network as an interest for named data
– Nodes diffusing the interest set up their own caches and gradients within the network, which channel the delivery of data
– During data transmission, reinforcement and negative reinforcement are used to converge to an efficient distribution
– Intermediate nodes fuse interests and aggregate, correlate, or cache data
– Assumes that sensor networks are task-specific: the task types are known at the time the sensor network is deployed
– An essential feature of directed diffusion is that interest and data propagation, as well as data aggregation, are determined by local interactions
– Focused on the design of dissemination protocols for tasks and events

Naming
– Task descriptions are named by a list of attribute-value pairs; a task description specifies an interest for data matching those pairs and is therefore also called an interest
– Example task: “Every I ms, for the next T seconds, send me a location of any four-legged animal in subregion R of the sensor field.”
   task = four-legged animal // detect animal location
   interval = 20 ms // send back events every 20 ms
   duration = 10 seconds // … for the next 10 seconds
   rect = [-100, 100, 200, 400] // from sensors within rectangle
– A sensor detecting an animal may generate the
following data:
   task = four-legged animal // type of animal seen
   instance = horse // instance of this type
   location = [150, 200] // node location
   intensity = 0.5 // signal amplitude measure
   confidence = 0.85 // confidence in the match
   timestamp = 01:30:45 // event generation time

Interests and Gradients
– An interest is generally issued by the sink node
– For each active task, the sink periodically broadcasts an interest message to each of its neighbors (including the rect and duration attributes)
– The sink periodically refreshes each interest by re-sending the same interest with a monotonically increasing timestamp attribute, for reliability purposes
– Every node maintains an interest cache in which each item corresponds to a distinct interest (a different type or interval attribute, or disjoint rect attributes)
– Interest entries in the cache do not contain information about the sink
– In some cases, the definition of distinct interests allows interest aggregation
– An interest entry contains several gradient fields, up to one per neighbor
– When a node receives an interest, it checks whether the interest exists in the cache:
1. If no matching entry exists, the node creates an interest entry
   o This entry has a single gradient toward the neighbor from which the interest was received, with the specified data rate
   o Individual neighbors can be distinguished by locally unique identifiers
2. If the interest entry exists but has no gradient for the sender of the interest
   o The node adds a gradient with the specified value
   o The node updates the entry’s timestamp and duration fields
3.
If both an entry and a gradient exist, the node updates the entry’s timestamp and duration fields
– When a gradient expires, it is removed from its interest entry
– When all gradients for an interest entry have expired, the interest entry is removed from the cache
– After receiving an interest, a node may re-send the interest to a subset of its neighbors
– To the neighbors, the interest may seem to originate from the sending node even though it may have been generated by a distant sink. This represents a local interaction
– This way, interests diffuse throughout the network; a node need not re-send every interest to all its neighbors if it recently sent a matching interest
– A gradient specifies a data rate (value) and a direction in directed diffusion, whereas in other sensor networks the gradient values can be used to forward data probabilistically along different paths

Data propagation
– A data message is unicast individually to the relevant neighbors
– A node receiving a data message from a neighbor checks whether a matching interest entry exists in its cache, according to the matching rules described
1. If no match exists, the data message is dropped
2.
If a match exists, the node checks the data cache associated with the matching interest entry
   o If the received data message has a matching data cache entry, the data message is dropped
   o Otherwise, the received message is added to the data cache and re-sent to the neighbors
– The data cache keeps track of recently seen data items, preventing loops
– By checking the data cache, a node can determine the data rate of the received events

Reinforcement
– After the sink starts receiving low-data-rate events, it reinforces one neighbor in order to “draw down” higher-quality (higher data rate) events
– This is achieved by data-driven local rules
– To reinforce a neighbor, the sink may re-send the original interest with a higher data rate
– When the requested data rate is higher than before, the receiving node must also reinforce at least one of its neighbors
– Reinforcement can propagate from neighbor to neighbor along a particular path (i.e., when a path delivers an event faster than others, the sink attempts to use this path to draw down high-quality data)
– In summary: reinforce one path, or part of it, based on observed losses, delay variances, and so on
– Negatively reinforce certain paths because their resource levels are low

[Figure adapted from Intanagonwiwat+ 2000]

Advantages:
– Data-centric dissemination
– Robust multi-path delivery
– Reinforcement-based adaptation to the empirically best network path
– Energy savings with in-network data aggregation and caching
– Gives designers the freedom to attach different semantics to gradient values
– Reinforcement can be triggered not only by sinks but also by intermediate nodes
Disadvantages:
– It may consume memory since the whole attribute list is sent
Suggestions/Improvements/Future Work:
– Exploration of possible naming schemes

Negotiation-Based Protocols for
Disseminating Information in Wireless Sensor Networks (SPIN Protocols) [Kulik+ 2002]
– SPIN (Sensor Protocols for Information via Negotiation) is a family of negotiation-based information dissemination protocols designed to address the deficiencies of classic flooding through negotiation and resource adaptation
– SPIN disseminates each sensor reading to all sensors in the network, treating all sensors as potential sink nodes
– Nodes using SPIN protocols name their data using high-level data descriptors called meta-data; meta-data negotiation eliminates the transmission of redundant data in the network
– Communication decisions can be based upon both application-specific knowledge of the data and knowledge of the resources available to nodes
– SPIN has two basic ideas:
   o Operate efficiently and conserve energy: nodes communicate with each other about the sensor data they have already received and the data they still need
   o Monitor and adapt to changes in their own energy resources: this extends the lifetime of the system
– Four different SPIN protocols: SPIN-PP, SPIN-EC, SPIN-BC, SPIN-RL

Meta Data
– Used to uniquely and completely describe the data being collected by sensors
– If two pieces of actual data are distinguishable, then their meta-data should also be distinguishable
– Since the format of meta-data is application-specific, each application needs to interpret and synthesize its own meta-data
– SPIN applications must define a meta-data format that takes into account the costs of storing, retrieving, and managing the meta-data
– SPIN nodes use three types of communication messages:
   o ADV (new data advertisement)
   o REQ (request for data)
   o DATA (data message)
– ADV and REQ messages contain only meta-data, so they are smaller than the DATA message

SPIN Resource Management
– SPIN applications are resource-aware and resource-adaptive
– By knowing the resources at hand, the nodes make informed decisions about using their
resources effectively
– SPIN specifies an interface that applications can use to find out their available resources, rather than specifying a particular energy-management protocol

The Problem
– In classic flooding, the source node sends data to all its neighbors; each neighbor checks its record of already-sent data to see whether it has forwarded the data to its own neighbors. If not, it forwards the data and updates the record
– This requires only a small amount of protocol state at each node and disseminates data quickly in a network where bandwidth is not scarce and links are not error-prone
– The problems include implosion, overlap, and resource blindness
   o Implosion: A node always sends data to its neighbors, regardless of whether the neighbors have already received the same data from other nodes
   o Overlap: Nodes waste energy and bandwidth by sending overlapping data
   o Resource blindness: Nodes do not make decisions based on the energy available

The Solution
– SPIN solves the problems of implosion and overlap by having nodes negotiate with each other before transmitting data, which eliminates the transmission of redundant data
– Nodes poll their resources before transmitting or processing data by probing the resource manager, which keeps track of resource consumption
– Nodes can make efficient decisions based on the available energy level
– The use of meta-data descriptors eliminates the possibility of overlap, since nodes can name the part of the data they are interested in receiving
– Awareness of local resources allows sensors to make meaningful decisions that extend their longevity

SPIN Protocols
1.
SPIN-PP: A three-stage handshake protocol for point-to-point media
– This protocol works in three stages (ADV-REQ-DATA), each stage corresponding to one of the messages
– The node sends an ADV message to its neighbors
– Neighbors check whether they have already received or requested this data
– If not, the neighbors respond by sending a REQ message to the sender
– The sender responds to each REQ message by sending the actual DATA to the requesting neighbor
– If a neighbor already has the advertised data, it does not send any message
– Simplicity is the main strength: nodes make simple decisions, so little energy is spent on computation
– Each node only needs to know about its one-hop neighbors
2. SPIN-EC: SPIN-PP with a low-energy threshold
– Adds a simple energy-conservation heuristic to the SPIN-PP protocol
– When energy is abundant, SPIN-EC behaves like SPIN-PP
– Whenever a node's energy comes close to the low-energy threshold, the node adapts by reducing its participation
– The node participates in the full protocol only if it believes it has enough energy to complete the protocol without dropping below the threshold
– It does not prevent nodes from receiving messages such as ADV or REQ below the low-energy threshold, but it prevents them from handling a DATA message below the threshold
3.
SPIN-BC: A three-stage handshake protocol for broadcast media
– Improves upon SPIN-PP for broadcast networks by using cheap one-to-many communication: all messages are sent to the broadcast address and processed by all nodes within transmission range of the sender
– This approach is often called broadcast message suppression
– SPIN-BC differs from SPIN-PP in three main ways:
   o All SPIN-BC nodes send their messages to the broadcast address, so all nodes within the transmission range of the sender receive each message
   o Upon receiving an ADV message, each node checks whether it already has the data. If not, the node sets a random timer whose expiry is chosen uniformly from a predetermined interval. When the timer expires, the node sends a REQ message to the broadcast address, naming the original advertiser in the message header. Nodes other than the original advertiser that receive the REQ cancel their own request timers, which prevents them from sending redundant copies of the same REQ
   o A node sends the requested data to the broadcast address only once, which delivers the data to all its neighbors. It does not respond to multiple requests for the same data
4.
SPIN-RL: SPIN-BC for lossy networks
– Reliable version of SPIN-BC that disseminates data through a broadcast network even when the network loses packets or communication is asymmetric
– Adds two adjustments to SPIN-BC to achieve reliability:
   o Each node keeps a record of which advertisements it has heard from which nodes; if it does not receive the data within a set time after a request, it re-requests the data
   o Nodes limit the frequency with which they resend data: a node waits for a set time before responding to any additional requests for the same data

Advantages:
– Meta-data negotiation and resource adaptation
– Maintains only local information about the nearest neighbors
– Suitable for mobile sensors, since the nodes base their forwarding decisions on local neighborhood information
Disadvantages:
– It cannot isolate the nodes that do not want to receive information; unnecessary power may be consumed
Suggestions/Improvements/Future Work:
– Study SPIN protocols in mobile wireless network models
– Develop more sophisticated resource-adaptation protocols to use available energy well
– Design protocols that make adaptive decisions based not only on the cost of communicating data but also on the cost of synthesizing it

DIFS: A Distributed Index for Features in Sensor Networks [Greenstein+ 2003]
– This work considers searches over semantically rich high-level events and presents the design, analysis, and numerical simulations of a spatially distributed index that provides for efficient index construction and range searches
– The conventional approach to storing time series data is to have all sensing nodes send their data to a central repository external to the environment
– While this provides flexibility in processing the data, sending every sensor reading to an external site incurs high energy consumption
– In addition, the links near a gateway or an external storage repository can become communication
bottlenecks as the network size and the volume of sensed data increase
– As a result, it may be advisable to store data locally, at or near the location where the sensed data is generated
– One approach to retrieving this stored data is to flood a query to all nodes that may have suitable data and have those nodes send their responses to the querying node
– In this approach, data is sent when and where it is required
– If some queries originate within the sensor network, it is not advisable to send the data to an external site instead of to the internal querying node
– If more data is collected than required, this local storage approach increases energy savings
– There are two extensions to this approach for further energy savings:
1. Data can be processed, aggregated, and/or pruned while propagating toward the query sink. The developers of Directed Diffusion [Intanagonwiwat+ 2000], TAG [Madden+ 2002], and others describe specific forms of in-network aggregation and pruning that can select relevant data and produce statistics. This approach uses “data-centric” routing, in which queries are not directed toward individual nodes but are stated only in terms of the desired data
2. The data can be processed locally to identify high-level “events” that are of interest. These events can refer directly to sensor readings. The queries are directly for such events, and the responses comprise summarized data about those events.
Here, the routing is also data-centric, but queries and responses interact with higher-level abstractions
– These energy-saving approaches reduce the energy required to respond to queries, but they do not address the cost of the basic “flood-then-respond” approach, namely the cost of flooding each query to all possible nodes
– The “data-centric storage” (DCS) approach [Shenker+ 2002] avoids flooding queries: all events are named and stored at a network location based on the name, and queries for an event are routed to the network node where the relevant data can be accessed
– Storing data by name creates a mechanism linking data and queries, so that queries need not be flooded
– GHT [Ratnasamy+ 2002] proposes a specific solution for achieving DCS in which event names are hashed to geographic locations and stored at the node closest to the hashed location
– DIFS extends the data-centric storage architecture to support range queries, where only events with attributes in a certain range are desired.
It provides for low average search and storage communication requirements and tries to balance these requirements over the participating nodes
– DIMENSIONS [Ganesan+ 2002] also relies on the placement of data within the sensornet and on the use of data-centric rendezvous points; it works with lower-level sensor readings and produces a multiresolution index (or view) of the data

High-Level Events
– High-level events, such as a hot region, a target detection, a map, or a histogram, can be described in many ways
– The paper proposes adding new data structures that store high-level data abstractions to the simple attribute types introduced by Diffusion
– Such abstractions would be defined system-wide at deployment time

Classification of Event Properties and Relationships
– The proposed classification has been designed with attribute range and distribution queries in mind
– The goals of a system directed at binary events such as “zebra sightings” differ from the goals of providing range searches over events that are each composed of attributes with values
– The goal of a search over binary events is to determine the locations of those events; when such events are rare, it is much more energy-efficient to construct a rendezvous point where events can register and queries can search than to flood a search
– Events defined by attributes with values that fall within a specified range are less common; for example, there may be many hot regions in a network, but few with a heat gradient whose slope is greater than s
– For this reason, the paper develops a new method to support range queries more efficiently and proposes mechanisms that run on top of GHT to address range queries
– The high-level events are classified as follows:
1.
Sensor value(s):
   o Includes raw sensor values that make up high-level events, composite measurements, and summary statistics such as average, median, etc.
   o Examples include the peak temperature of a hot region and the speed at which an animal target is moving
   o Sensor values can be searched over a designated area and are represented as integers or floating-point numbers
2. Timing parameters:
   o It is essential to know not only a specific value for a region but also how this value varies over time, e.g., a hot region that has been hot for some period of time
3. Spatial dimensions:
   o Refers to the physical shape and location of an event, e.g., hot regions larger than a given area
   o Regions can be described as enclosing circles, ellipses, or polygons, and their points of interest can be represented as integer or floating-point coordinates
4. Event interrelationships:
   o In the spatial domain, relationships between events translate to proximity or intersection, e.g., is an area of high CO2 concentration also an area of bright sunlight?
   o In the temporal domain, event interrelationships translate to succession and temporal separation, e.g., did an area of high CO2 concentration appear immediately after bright sunlight?
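The four property classes above can be pictured as fields and predicates on a single event record. The following is a minimal Python sketch, not the representation used in the DIFS paper: the `Event` class, its field names, the choice of an enclosing circle for the spatial dimension, and the sample values are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical high-level event record (illustrative field names)."""
    kind: str                    # label such as "hot_region" (assumed)
    value: float                 # sensor value, e.g. a peak reading
    start: float                 # timing: second at which the event began
    end: float                   # timing: second at which the event ended
    center: Tuple[float, float]  # spatial: centre of an enclosing circle
    radius: float                # spatial: radius of that circle

    def duration(self) -> float:
        """Timing parameter: how long the event lasted."""
        return self.end - self.start


def value_in_range(e: Event, lo: float, hi: float) -> bool:
    """Sensor-value range test, the kind of predicate a range query uses."""
    return lo <= e.value <= hi


def intersects(a: Event, b: Event) -> bool:
    """Spatial interrelationship: do the two enclosing circles overlap?"""
    dist = math.hypot(a.center[0] - b.center[0], a.center[1] - b.center[1])
    return dist <= a.radius + b.radius


def follows(a: Event, b: Event, max_gap: float) -> bool:
    """Temporal interrelationship: b starts within max_gap seconds after a ends."""
    return 0.0 <= b.start - a.end <= max_gap


# Made-up sample events mirroring the CO2 / sunlight example above.
sun = Event("bright_sunlight", 900.0, 0.0, 60.0, (10.0, 10.0), 5.0)
co2 = Event("high_co2", 450.0, 65.0, 120.0, (13.0, 14.0), 2.0)
```

With these sample values, `intersects(sun, co2)` answers the spatial question and `follows(sun, co2, 10.0)` answers the temporal one; both predicates are evaluated on event summaries rather than on raw time series.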
Table 1: Event Property and Relationship Classification

Storage and Search Architecture
– Time series data generated by sensor nodes is processed locally by statistical and pattern-recognition engines to generate high-level events; these events are stored locally where they are created, and information about their various attributes is inserted into indices
– An interested user or an automaton poses queries to these indices
– The query results are found in the indices themselves, at the storage nodes, and even at the nodes that generate the time series data
– In terms of event generation and search, nodes serve two functions:
1. All nodes may be used to store raw time series data and events
2. A subset of nodes serve as index nodes to facilitate search

Figure 2: A storage and search architecture

Advantages:
– DIFS efficiently supports range queries and queries related to the distribution of values in space by using histograms that direct queries to the relevant nodes
– The paper builds on an already proven technique, and simulation results show that DIFS outperforms GHT in query and communication costs
– DIFS was designed to balance communication load over the network by having more than one query entry point and by allowing a search to originate at any node in the tree
– DIFS scales to a large number of searches or stores because it removes the restriction of propagating every data item to the root and originating every query at the root
Disadvantages:
– No mention of the failure of sensor nodes at a given level of the hierarchy in the quad-tree structure of DIFS
– In case of dense deployment, a uniform
distribution of data values causes the DIFS algorithm to explore all the leaves; hence it is not a very good option as far as energy consumption is concerned
– No mention of making querying and event insertion resilient to packet loss
– Overhead incurred while maintaining extra parent information
Suggestions/Improvements/Future Work:
– Introduce dynamic repartitioning when the distribution changes over time
– To handle large queries, they could be split into smaller sub-queries that are encoded so they can be identified later and processed separately, either locally or by forwarding them to other nodes with less traffic; this would avoid energy depletion at the busiest query access nodes
– Handle data corruption at index nodes
– Improve DIFS search cost
   o Route the query using hierarchical dissemination, as in structured replication, rather than sending unicast messages to each of the covering nodes
   o Route to nodes in the highest tree level that covers the entire query range, rather than decomposing the query range into a minimal covering set

Power-Efficient Data Dissemination in Wireless Sensor Networks [Cetintemel+ 2003]
Introduction
– This work presents a new event-based communication model
– The proposed protocol, called Topology-Divided Dynamic Event Scheduling (TD-DES), organizes the wireless network into a multi-hop network tree
– The root of the tree creates a data dissemination schedule and propagates it throughout the tree
– The schedule is divided into fixed-size time slots, each indicating the type of data sent (or received) and whether the slot is for downstream (i.e., away from the root) or upstream (i.e., toward the root) communication
– The schedule can be periodic or refreshed at arbitrary intervals, depending on the data traffic and applications. The idea is that nodes can save energy by powering down their radios to standby mode when they
have no data to send, and when they (and their descendants) do not wish to receive the data being transmitted
– The system uses the publish/subscribe model: each node has a specific subscription profile that indicates which data types the node is interested in receiving
– TD-DES allows each node to listen selectively for the data it is interested in, based on its position in the network topology
– Since data must be scheduled before it is sent, the main tradeoff investigated is increased power efficiency in exchange for sub-optimal message dissemination latency
– This work addresses application-specific scheduling and data dissemination issues, which were not taken into consideration by previous work in this area

II. System Model
– TD-DES is intended as an application overlay on a CSMA/CA wireless MAC layer rather than as a MAC/networking layer in itself
II.A Scheduling Model
– TD-DES monitors when each node of a network (1) receives data, (2) transmits data, and (3) powers its radio down to a low-power standby mode
– These radio modes (Tx, Rx, and standby) are cycled through as functions of time determined by the network’s dissemination schedule, which is generated by the root node and propagated down the tree as part of a control event
– The base station is considered to be the root node, with higher computational, storage, and transmission capabilities than the rest of the nodes; it can serve as an entry point to the sensor network, integrating it with the external wired network where the monitoring task GUI resides
– The scheduler depends on topology information, event profiles, traffic statistics, and QoS requirements when generating dissemination schedules
– The goal of the scheduler is to
minimize network-wide power consumption (by minimizing the amount of time spent in the Rx and Tx modes) without sacrificing timely dissemination of data
II.B Network Model
– TD-DES has an integrated network construction layer that organizes a wireless network into a tree topology
– The topology is constructed by broadcasting advertisements from all nodes
– First, the root node broadcasts a parent advertisement
– Each node hearing this advertisement replies with a child message indicating that it will become a child of the root
– Whenever a node becomes a child, it broadcasts its own parent advertisement
– The process continues until all nodes are attached to the tree
– A node that hears multiple parent advertisements chooses as its parent the node with the lowest hop count to the root
– The tree construction layer adapts to topology changes due to node failures, additions, and mobility
– Data events are disseminated throughout the network based on per-node event descriptions rather than point-to-point messaging
– This publish/subscribe type of event-based communication is the data dissemination model of choice since it decouples the producers and consumers of information
II.C Data/Event Model
– Overlaying applications define event types in advance, and these event types are maintained in a global event schema
– For instance, a network with n different event types may publish event types e1, e2, e3, …, en
– Each node maintains its own event subscription, which is the set of event types the node is interested in, as well as its effective subscription, which is the union of its own subscription and the subscriptions of all its descendants
– Each node subscribes to any event type of its own interest as well as any event type that a descendant node is interested in, since each
node is responsible for forwarding all relevant events to its descendants in the tree topology

Figure 3: An example dissemination tree. Subscriptions are given at the upper left corner of each node, effective subscriptions at the upper right. Arrows indicate the links over which the event is broadcast

– Figure 3 presents a dissemination tree of eight nodes and three event types e1, e2, and e3, with N1 being the root node of the tree
– The subscription of each node is given in parentheses at the upper left of the node, and the effective subscription is given in square brackets at the upper right
– Note that an event of type e2 is generated at node N5
– The arrows indicate the links across which the event is broadcast to disseminate it to all subscribing nodes
– Note that the event is propagated both upstream (to the root and then downstream to the interested parties in the other sub-tree) and downstream; therefore, events do not always go through the root node
– An event is a message type with its own unique application-specific semantics
– Consider a scenario in which a sensor network whose purpose is to detect fires is deployed over a forested region
– A sensor node may issue a fire_detected event to the network if its temperature reading is very high
– This event would be disseminated through the network to all those nodes (such as forest ranger stations, a centralized forest fire monitoring station, or a sink node which could notify the police, local fire-fighting units, and public
news services) subscribing to fire_detected events
– The receiving nodes also include any intermediate nodes that have to forward such events to interested nodes, even if they are not interested themselves
II.D Application-defined QoS
– Besides carrying unique type semantics, event types may be associated with network-specific physical characteristics, such as minimum and maximum event payload sizes, latency constraints, and relative event priorities
– The overlaying applications specify such event latency and priority values
III. Protocols
– The TD-DES event schedule determines the temporal partitioning of the RF medium among the event types by allocating time slots (or simply slots) to each event type
– Each time slot is assumed to be wide enough for a single event to be propagated one hop; in other words, each slot should give the underlying MAC layer sufficient time to perform collision detection and retransmissions under contention
– Time slots are allocated for each event type based on the determined or expected bandwidth required to propagate all generated events reliably throughout the network
– Once the numbers of upstream and downstream time slots for each event type are determined, the ordering of the time slots must be determined
– An iteration is an interval of the schedule that starts with a control event slot; downstream and upstream slots can also be interleaved to fit into a single iteration
III.A Schedule Propagation
– The root node creates a schedule of time slots in which each slot is designated as a send or receive slot, as an upstream or downstream slot, and by the event type it should be used to propagate
– The schedule is created one iteration at a time and passed down through the dissemination tree inside a control event
– The schedule of slots between two consecutive downstream control events is called a single iteration of the schedule
– Figure 4 presents the basic idea of creating a schedule and passing it down the network tree, using a scenario with downstream propagation of control and data events
– The control event is created by TD-DES and contains scheduling information
– The control event is received by the node at level X in the first time slot
– In the next time slot, this node transmits the control event down to the next level, X+1.
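The level-by-level hand-off of the control event can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TD-DES code: the function name `control_event_slots` and its parameters are hypothetical, and slots are simply numbered from 0, starting at the slot in which level X receives the control event.

```python
def control_event_slots(first_level, num_levels, first_slot=0):
    """Map each tree level to its (receive_slot, send_slot) for one control event.

    Assumption for illustration: a node forwards the control event in the
    slot immediately after the one in which it received it.
    """
    slots = {}
    for offset in range(num_levels):
        level = first_level + offset
        receive_slot = first_slot + offset   # one slot later per level
        send_slot = receive_slot + 1         # forwarded in the next slot
        slots[level] = (receive_slot, send_slot)
    return slots


if __name__ == "__main__":
    for level, (rx, tx) in control_event_slots(3, 3).items():
        print(f"level {level}: receive in slot {rx}, send in slot {tx}")
```

In this sketch the send slot of one level is the receive slot of the level below it, which is the pairing the control event relies on as it travels down the tree.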
– In the following time slot, the node at level X+1 passes the control event down to level X+2, and so on
– Iterations are delimited by control events and may contain different numbers of data events
– The control event initiating an iteration specifies the schedule of events within that iteration
– Note that the schedule is shifted one slot at each level
– The schedule is a sequence of atomic send and receive time slots, each one for a specified event type
– Generally, at a given node, the time slots for a particular event in a schedule are allocated as a receive slot immediately followed by a send slot
Figure 4: An example of schedule propagation
– If the schedule specifies a receive time slot for event type e1 and the node subscribes to e1, the node listens to the RF medium in Rx mode during this time slot to receive such an event
– If the schedule specifies a send time slot for e1, the node can transmit an event of this type
– Each slot is either a downstream slot (for parent-to-child communication away from the root) or an upstream slot (for child-to-parent communication toward the root)
– For downstream communication, send and receive slots are designated per event type, whereas upstream slots are not designated for event types: any generated event may make use of the next upstream slot
– Because all events must be passed up to the root regardless of event type, nodes must always listen during upstream receive slots, so dedicating upstream slots to specific event types would not be meaningful
– The downstream control event includes data used by the tree construction algorithm, such as the number of hops to
the root and the parent node’s network-unique identifier
– For each downstream send slot, the simultaneous time slot at the next level down is a receive slot for the same type of event
– Similarly, for each upstream send slot, the concurrent time slot at the next level up is the corresponding receive slot for the same type of event
III.B Deterministic and Speculative Scheduling
– TD-DES schedules time slots in two modes, deterministic and speculative; the deterministic algorithm is used for downstream dissemination and the speculative algorithm for upstream dissemination
– It is assumed that most event propagation is downstream
– In the deterministic algorithm, events are propagated in back-to-back iterations, where each iteration is further divided into slots of fixed width
– The scheduler (root node) knows the exact events to be broadcast at the beginning of each iteration and allocates the required number of slots accordingly
– The schedule is propagated to every node in the form of a control packet at the beginning of each iteration
– A control packet can also contain timing information for the next control packet if iterations do not have a fixed length
– When the root node starts transmitting events, each node keeps its radio in Rx mode for the duration of each slot in which an event of interest will arrive
– Figure 5 presents the process of deterministic scheduling
– R and S denote the receive and send slots for the control events
– Event e1 generated during iteration k cannot be scheduled until iteration k+1
– The control event transmitted during the second S includes the schedule for iteration k+1
– The exact time slot during which e1 will be scheduled is determined by the specific ordering criterion
Figure 5: Deterministic Scheduling
– In speculative scheduling, the scheduler estimates the expected frequency of event types at the root node and pre-allocates slots based on this estimate
– Since the allocation of slots for each event type is periodic, that is, the same from one iteration to the next, the schedule needs to be broadcast only when it is updated
– The drawback of speculative scheduling is that nodes may have to stay in Rx mode for scheduled slots regardless of whether an event actually arrives
– Figure 6 presents the process of speculative scheduling
– Event e1 is received during iteration k after its scheduled slot (indicated by the dashed lines); therefore, e1 must be queued until it can be transmitted during its slot in iteration k+1
Figure 6: Speculative Scheduling
– The schedule decided by the root node is known to every node regardless of the algorithm used
– A child node’s downstream schedule is one slot behind its parent node’s downstream schedule, whereas a child node’s upstream schedule is one slot ahead of the parent node’s upstream schedule
– This allows tight pipelining: a downstream/upstream event received by node i in slot t will be sent downward/upward to i’s children/parent in slot t + 1
– If shifting happens at the boundary between the upstream and downstream schedules, the downstream schedule will shift beyond the neighboring upstream schedule and, similarly, the upstream schedule will shift beyond the neighboring downstream schedule
– The schedule created by the root
node can be extended at an internal node to accommodate events generated at internal nodes
– A sub-tree rooted at an internal node may not be interested in every event; therefore, when an internal node propagates the root schedule down to its descendants, it can extend the root schedule by replacing the uninteresting slots with slots for its own events or, if more slots are required, by using blank slots in the root schedule
– This extended schedule only affects the sub-tree rooted at this internal node
III.C Scheduling Criteria
– The scheduling criteria determine how the root node decides on event ordering in the downstream schedule
– When the iteration length and slot length are fixed, a deterministic schedule reduces to an ordering of events, which is determined according to one (or a combination) of three criteria:
o priority: the relative priority of an event type over other event types
o popularity: the number of nodes subscribing to an event type
o latency constraint: the max.
dissemination delay for an event type
– Priorities can be specified by the application layer for event types at the root node and passed down the tree within the downstream control event
– If the priorities are relatively fixed, they need to be included in the control event only when new event types are added or the priorities change
– Events can also be ordered by popularity
– The event types with the most subscribers are assumed to be the most important to the system, so they are scheduled first in the upcoming iteration(s)
– The tree-construction and maintenance layer of TD-DES gathers the popularity of each event type in a bottom-up manner
– Consider a subscription to a specific event type ei
– Each node p maintains count(ei), the number of nodes in its sub-tree that are interested in events of type ei
– Using a subscript to indicate the node at which the variable is maintained, countp(ei) can be computed recursively by
$\mathrm{count}_p(e_i) = \sum_{c \in \mathrm{children}(p)} \mathrm{count}_c(e_i) + \begin{cases} 1 & \text{if } p \text{ subscribes to } e_i \\ 0 & \text{otherwise} \end{cases}$
where the last term is 1 if p itself is subscribed and 0 otherwise
– If latency constraints are specified by the application layer, TD-DES will use the average- and worst-case dissemination latency estimates when scheduling events
– The overall dissemination latency of an event can be reduced by scheduling it as early as possible, which reduces the scheduling delay component of the latency
– Because the real latency is not available at scheduling time, hop-count-based distance is used as an estimator instead
– The number of hops from the root to a node k subscribing to event type ei is called the distance of ei at node k
o distanceavg(ei): avg.
distance for all nodes subscribing to event type ei
o distancewst(ei): the worst-case distance
– The tree gathers data by having each internal node maintain partial values for its own sub-tree and pass these values up to its parent node in its upstream control event
– In addition to count(ei), each node j maintains the following metrics:
o costj(ei): the total number of hops an event ei must be propagated to the entire sub-tree rooted at the current node j
o avg_costj(ei): the average number of hops an event ei must be propagated per interested node inside the sub-tree rooted at the current node j
o max_costj(ei): the maximum number of hops an event ei must be propagated to an interested node inside the sub-tree rooted at the current node j
– Each node j passes its costj(ei) and max_costj(ei) values to the parent as parameters of its upstream control event
– For each child j of an internal node k, the parent node calculates its own values recursively; the costk(ei) at k is calculated in terms of each child:
$\mathrm{cost}_k(e_i) = \sum_{j \in \mathrm{children}(k)} \left( \mathrm{cost}_j(e_i) + \mathrm{count}_j(e_i) \right)$
– The maximum cost value is the maximum of the maxima of its children plus 1:
$\mathrm{max\_cost}_k(e_i) = \max_{j \in \mathrm{children}(k)} \mathrm{max\_cost}_j(e_i) + 1$
– At each node, the avg_costk(ei) is a derived value of countk(ei) and costk(ei):
$\mathrm{avg\_cost}_k(e_i) = \mathrm{cost}_k(e_i) / \mathrm{count}_k(e_i)$
– The root node, r, defines, for each event type ei, the system-wide count and distance values as follows: count(ei) = countr(ei), distanceavg(ei) = avg_costr(ei), and distancewst(ei) = max_costr(ei)
– Since all internal nodes are interested in knowing these three values, the root node
disseminates these values in a downstream control event as they change
III.D Interleaved Scheduling
– This stage determines the actual sequence of the time slots allocated for the next iteration
– At this point, the number and ordering of events in the downstream schedule and the number of slots in the upstream schedule are already known
– The sequencer must derive a set of ordered slots for the next iteration from these two schedules
– There are two choices: place the upstream and downstream slots in separate contiguous blocks (a.k.a. clustered) or interleave them
– In the clustered version, the ordered downstream set is placed unbroken, followed immediately by the ordered upstream set and then by some blank time slots
– A downstream control event is placed at the beginning of each iteration
Advantages:
– The authors strive to achieve maximum power conservation by completely powering down a sensor node's radio during the portions of the schedule that do not match the node's event subscription
– The authors do not try to reinvent the wheel by introducing a radically new protocol; the proposed protocol, TD-DES, is intended as an application overlay on the already established CSMA/CA wireless MAC layer
– The publish/subscribe style of event-based communication makes the protocol well suited for dynamic ad hoc environments
Disadvantages:
– Does not consider transmission failures
– Does not discuss the construction of the topology tree
– Time synchronization is assumed by the authors
Suggestions/Improvements/Future Work:
– Since constructing a tree structure that is optimal with respect to power consumption is NP-complete, we can use the following two heuristics:
o Centralized Tree-topology: In this case, we can periodically recompute
the tree using a centralized incremental power heuristic, where we add one sensor at a time, choosing the one with the least incremental transmit power
o Distributed Tree-topology: Each sensor node's position in the tree is decided locally in collaboration with its neighbor nodes
– As mentioned in the literature, we can extend the protocol to include upstream and downstream aggregation and caching
– Future work can be summarized as follows:
o Implementation and clock synchronization
o Mobility and reliability
o Caching and aggregation
References
[Agrawal+ 2001] D.P. Agrawal and Arati Manjeshwar, TEEN: A Routing Protocol for Enhanced Efficiency in Wireless Sensor Networks, In Proceedings of Tenth International Conference on Computer Communications and Networks, 2001, pp. 304-309.
[Cetintemel+ 2003] U. Cetintemel, A. Flinders, and Y. Sun, Power-Efficient Data Dissemination in Wireless Sensor Networks, In Proceedings of the 3rd ACM International Workshop on Data Engineering for Wireless and Mobile Access (MobiDE’03), September 2003.
[Ganesan+ 2002] D. Ganesan, D. Estrin, and J. Heidemann, DIMENSIONS: Why do we need a new Data Handling architecture for sensor networks?, In Proceedings of First Workshop on Hot Topics in Networks (HotNets-I), October 2002.
[Greenstein+ 2003] B. Greenstein, D. Estrin, R. Govindan, S. Ratnasamy, and S. Shenker, DIFS: A Distributed Index for Features in Sensor Networks, In Proceedings of First IEEE International Workshop on Sensor Network Protocols and Applications, May 2003.
[Heinzelman+ 2002] W. Heinzelman, A.P. Chandrakasan and H. Balakrishnan, An Application-Specific Protocol Architecture for Wireless Microsensor Networks, IEEE Transactions on Wireless Communications, Vol. 1, No. 4, October 2002, pp. 660-670.
[Heinzelman+ 2000] W. Heinzelman, A.P. Chandrakasan and H.
Balakrishnan, Energy-Efficient Communication Protocol for Wireless Microsensor Networks, IEEE Proceedings of the Hawaii International Conference on System Sciences, January 4-7, 2000, Maui, Hawaii.
[Intanagonwiwat+ 2000] C. Intanagonwiwat, R. Govindan and D. Estrin, Directed Diffusion: A Scalable and Robust Communication Paradigm for Sensor Networks, In Proceedings of the Sixth Annual International Conference on Mobile Computing and Networks (MobiCOM 2000), August 2000, Boston, Massachusetts.
[Kim+ 2003] H. S. Kim, T. Abdelzaher, and W. H. Kwon, Minimum-Energy Asynchronous Dissemination to Mobile Sinks in Wireless Sensor Networks, ACM SenSys, Los Angeles, CA, November 2003.
[Kulik+ 2002] J. Kulik, W. Heinzelman and H. Balakrishnan, Negotiation-Based Protocols for Disseminating Information in Wireless Sensor Networks, Wireless Networks 8(2-3), 2002, pp. 169-185.
[Lindsey+ 2001] S. Lindsey and C.S. Raghavendra, PEGASIS: Power-Efficient Gathering in Sensor Information Systems, In Proceedings of International Conference on Communications, 2001.
[Madden+ 2002] S. Madden, M.J. Franklin, J.M. Hellerstein, and W. Hong, TAG: a Tiny Aggregation Service for Ad-Hoc Sensor Networks, In Proceedings of Fifth Symposium on Operating Systems Design and Implementation (OSDI), Boston, Massachusetts, December 2002.
[Ratnasamy+ 2002] S. Ratnasamy, B. Karp, L. Yin, F. Yu, D. Estrin, and R. Govindan, GHT: A Geographic Hash Table for Data-Centric Storage, In Proceedings of First ACM International Workshop on Wireless Sensor Networks and Applications (WSNA 2002), Atlanta, GA, September 2002.
[Shenker+ 2002] S. Shenker, S. Ratnasamy, B. Karp, R. Govindan, and D. Estrin, Data-Centric Storage in Sensornets, In Proceedings of First ACM SIGCOMM Workshop on Hot Topics in Networks (HotNets 2002), Princeton, NJ, October 2002.
[Tilak+ 2002] S. Tilak, N. Abu-Ghazaleh, and W.
Heinzelman, A Taxonomy of Wireless Micro-Sensor Network Models, Mobile Computing and Communications Review (MC2R), vol. 6, no. 2, April 2002, pp. 28-36.