Using Continuous Queries for Event Filtering and Routing in Sparse MANETs

Katrine Stemland Skjelsvik, Jarle Søberg, Vera Goebel, and Thomas Plagemann
University of Oslo, Department of Informatics
P.O. Box 1080 Blindern, N-0316 Oslo
{katrins, jarleso, goebel, plageman}@ifi.uio.no

Abstract

In our project, we design middleware services for emergency and rescue scenarios in sparse Mobile Ad-Hoc Networks (SMANETs). One of these services is a distributed event notification service (DENS) for asynchronous communication. The DENS architecture allows several subscription languages for different kinds of subscriptions. To support complex subscriptions, we use a Data Stream Management System (DSMS) for filtering the data streams and for matching filtered events against subscriptions. We have designed a simple rescue scenario to investigate the possibilities of using a DSMS together with the DENS middleware layer. In this paper, we discuss the design of our implementation and the first experimental results of our ongoing work.

1. Introduction and Background

In the Ad-Hoc InfoWare¹ project, we focus on developing middleware services for sparse Mobile Ad-Hoc Networks (SMANETs) used for information sharing in emergency and rescue operations [1]. The network is formed by the rescue personnel's wireless devices, like mobile phones, PDAs, and laptops. SMANETs require different solutions than ordinary Mobile Ad-Hoc Networks (MANETs), since not all nodes can be reached at all times, and messages may have to be delivered by store-carry-forward operations. These operations let the source node or intermediate nodes store data temporarily, until the destination node is reachable.

¹ The Ad-Hoc InfoWare project is funded by the Norwegian Research Council (NFR) under the IKT2010 Program, 2003-2007.

Some examples of applications in a rescue operation are context-aware medical diagnosis and treatment support, and real-time evidence collection and management.
The use of publish/subscribe services for communication between components in such distributed applications has been widely accepted. The reason is that the consumers of information, the subscribers, do not have to be in direct contact with the producers of the information, the publishers. The subscribers and publishers are decoupled in time and space. This is especially suitable in SMANETs, where the density of nodes may be low, and where partitioning and merging occur because of disruptions, disconnections, and movement. Our delay-tolerant distributed event notification service (DENS) provides such a service [2]. The DENS runs on some of the nodes in the network. We call these nodes DENS mediators, and they form an overlay network in which they send notifications of events. In the context of MANETs and SMANETs, the DENS mediators may also be mobile.

We perform source filtering of events to reduce resource consumption and to avoid sending unnecessary data. Simple filtering of stateless data can be expressed in attribute-operator-value tuples, such as attr: temp > 38. However, more complex subscriptions can be useful, for example subscriptions that express interest in stateful composite events, i.e., events that depend on previous events. One might for example be interested in average temperatures in a predefined period, or in trends telling how the temperature changes depend on pulse readings. To express such a subscription, we need a more expressive language than just an attribute-operator-value filter. One possible example of such a language is SQL. SQL provides a declarative way of expressing subscriptions as queries on well-defined schema-based tuples.

In our rescue scenario, typical data sources are sensors attached to a patient by a paramedic (PM). The PM subscribes to health data publications of the patient, like pulse, temperature, or blood pressure.
The sensors continuously produce data tuples that contain information about the sensor readings. Since these tuples arrive continuously and in streams, we suggest using the general concept of a continuous query (CQ) language [3]. A CQ differs from a traditional database query in that it continuously obtains data from a data stream and evaluates these data against its query. A CQ subscription language (CQ-SL) is then a declarative language that can describe stateful events in a simple way.

In order to support CQ-SL subscriptions, we use a data stream management system (DSMS) [3]. In a DSMS, the data streams flow through main memory while one or several CQs are executed on these data streams. The results of the CQs are data streams of tuples, which in our scenario can be mapped to publications. The DENS uses a DSMS for two tasks. The first task is to perform source filtering of sensor data tuples on the publisher nodes. As mentioned, source filtering prevents unnecessary tuples from being sent. The second task is to match the stream of filtered events, i.e., the notification stream, against the CQ-SL subscriptions at the DENS mediators. Filtering events at the source with a given DSMS seems to be relatively straightforward, and is similar to using a DSMS to filter data from sensors in a sensor network [4]. Matching notification streams on the DENS mediators is similar.

The main contribution of this paper is how we use the DSMS for matching, i.e., for relating notifications to subscriptions. The benefit of our approach is that it proposes a simple solution for filtering of CQ-subscriptions at the source, for matching of notifications and subscriptions at the DENS mediators, and for integrating a DSMS with the DENS. The matching process also helps achieve higher reliability, since in cases where the network is partitioned, not all interested subscriber nodes may have been able to send a subscription to the publisher node via the DENS overlay.
If another subscriber node in the same partition as the publisher node is interested in similar events, a subscriber in the other partition may still receive the notification by using store-carry-forward operations in the DENS overlay. Hence, efficient routing of notifications to interested subscribers is obtained through the matching process.

The rest of this paper is organized as follows: In Section 2, we briefly describe the DENS component and protocols, and how the design of the service supports different subscription languages. In Section 3, we describe the system architecture and the application domain more closely. We also discuss design alternatives. The implementation and some early experimental results are presented in Section 4. Related work is discussed in Section 5, and in Section 6, we conclude and discuss future work.

2. DENS in a Nutshell

From the DENS perspective, there are three roles for nodes in the network: subscribers, publishers, and DENS mediators. An example of a DENS application is a health monitoring application program running on a PM's PDA. The application collects and updates the health status of injured persons. Such an application can subscribe to information about specific patients, and to alerts when data values from health sensors are above or below some threshold. On the publisher node, a monitoring agent, called a watchdog (WD), filters events based on given subscriptions. When a DENS mediator receives a subscription, it locates potential publisher nodes by using another Ad-Hoc InfoWare component, the Knowledge Manager [5], and sends the subscription to these nodes. A WD Manager registers the subscription and starts the filtering specific to the subscription language. When an event occurs that fulfills the condition specified in a subscription, the WD Manager generates a notification according to the specific subscription language and sends the notification to the DENS.
The DENS mediator then matches the notification with the stored subscription and delivers the notification to the correct subscriber nodes. In our implementation, the WD is implemented by the DSMS.

If there is no possibility of reaching the destination, because of network partitioning or because the destination device is turned off, the DENS overlay stores the subscription or notification and tries again later, i.e., it performs a store-carry-forward operation. Furthermore, subscriptions and undelivered notifications are replicated among the DENS mediators to increase service availability and to enable all DENS mediators to gracefully degrade their service if they are partitioned from the network. This means that the service is available even though some of the DENS mediators are temporarily turned off or out of communication range, because there is no single point of failure in DENS.

Figure 1 shows the architecture of a DENS mediator. There are three delivery components: SUB-DENS handles communication between a subscriber and DENS, DENS-DENS handles communication in the DENS overlay, and PUB-DENS handles communication between a publisher and DENS.

[Figure 1: DENS components. Labels: DENS, SUB-DENS, DENS-DENS, PUB-DENS, State Mgmt, Storage Mgmt, Availability & Scaling, Watchdog Manager, WD, Watchdog Execution Environment]

One important design principle for DENS is subscription language independence. This means that applications can use different subscription languages at the same time. However, to still be able to decouple subscribers and publishers, three language-specific functions must be plugged into the DENS for each subscription language: look-up() to identify the destination for a subscription, filter() to filter the event at the publisher, and match() to identify the destinations for notifications.

To be able to support CQ-SL, we must implement filter() and match(): Data tuples from sensors are filtered by the DSMS, and result tuples are sent to the PUB-DENS component. These result tuples are interpreted as a stream of notifications by the DENS mediator and are sent to the DSMS for matching. Only these functions inspect the content of subscriptions and notifications. For all other tasks in the DENS, subscriptions and notifications are just opaque packets. A subscription language ID is attached to all subscriptions and notifications and allows invoking the language-specific functions. A more detailed description of how this language independence works can be found in [6].

3. Scenario and Application Domain

Typical sources of data that are of interest for a health monitoring program are health monitoring sensors, which sense data like blood pressure, body temperature, and pulse of the patient the sensors are attached to. The application alerts the PM when data values from the health sensors are above or below some threshold defined by a CQ-subscription sent by the application on the PM's PDA.

There are two kinds of queries that might be stated with respect to the relevant publisher nodes. The first kind expresses interest in any patient having, for example, an average temperature above a given threshold. The DENS overlay has to distribute the query to those publishers connected to temperature sensors. The second kind of query aims for a specific person, or node. The DENS overlay sends this query only to the PDA closest to the patient sensors. Data from the sensors are collected by the PDA, and interpreted as a data stream by the DSMS running on this node. This is shown in Figure 2.

[Figure 2: Scenario example. Labels: PM, Patient]

A simple example of a query for a health monitoring application is:

    SELECT query_ID AS QUERY_NUM,
           AVG(bloodpr.value), AVG(pulse.value), AVG(temp.value),
           timestamp
    FROM bloodpr [SLIDE WINDOW BY '10 seconds'],
         pulse [SLIDE WINDOW BY '10 seconds'],
         temp [SLIDE WINDOW BY '10 seconds']
    WHERE bloodpr.timestamp = pulse.timestamp
      AND bloodpr.timestamp = temp.timestamp
      AND temp.value > 38;

This query joins three data streams and shows only those tuples, i.e., average values of blood pressure, pulse, and temperature, where the temperature readings are above 38 °C. The join attribute is the timestamp, meaning that only tuples having the same timestamp are joined. The average is calculated over a sliding window, which in this query is a time slice covering the last ten seconds. In this case, a result tuple is transmitted each second if the input streams match the query.

Note that the DENS overlay receives the queries without any additional responsibility for, for example, optimizing queries or handling overlapping query statements. Interpretation and optimization of the subscription queries are performed by the DSMS. An example of overlapping query statements is as follows: A subscriber Sa wants to obtain all tuples where values are between 35 and 40, while subscriber Sb wants all values between 37 and 39.

In our scenario, handling scalability is not the main requirement. The network size is not in the order of millions of nodes, and only a subset of the nodes is interested in the same subscriptions. Different groups of rescue personnel, such as leaders, fire fighters, police, or medical personnel, are in most cases interested in different kinds of events. It will therefore not be the case that a DSMS running on a PDA, which collects data from sensor nodes, needs to be able to manage thousands of subscription queries.
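The semantics of the example query in this section can be illustrated with a small sketch. It is not the TelegraphCQ implementation used in our prototype; it is a plain Python approximation written under stated assumptions: each stream carries tuples of the generic form <value, patient, timestamp>, there is at most one reading per timestamp in each stream, and the window covers the ten seconds up to and including the evaluation time. The names Reading and windowed_averages are hypothetical and introduced only for illustration.

```python
from collections import namedtuple

# Generic sensor tuple <value, patient, timestamp>; the name is illustrative.
Reading = namedtuple("Reading", ["value", "patient", "timestamp"])

WINDOW_SECONDS = 10      # sliding window length from the example query
TEMP_THRESHOLD = 38      # predicate temp.value > 38


def windowed_averages(bloodpr, pulse, temp, now, query_id=1):
    """Evaluate the example query once, for the window ending at `now`.

    Returns a result dict, or None when no joined tuple passes the filter.
    Assumes at most one reading per timestamp in each stream.
    """
    lo = now - WINDOW_SECONDS  # window covers timestamps in (lo, now]

    def in_window(stream):
        return {r.timestamp: r.value for r in stream if lo < r.timestamp <= now}

    bp, pu, te = in_window(bloodpr), in_window(pulse), in_window(temp)

    # Join on equal timestamps, then apply the temperature predicate.
    joined = [
        (bp[ts], pu[ts], te[ts])
        for ts in sorted(bp.keys() & pu.keys() & te.keys())
        if te[ts] > TEMP_THRESHOLD
    ]
    if not joined:
        return None  # nothing to publish for this window

    n = len(joined)
    return {
        "query_id": query_id,
        "avg_bloodpr": sum(j[0] for j in joined) / n,
        "avg_pulse": sum(j[1] for j in joined) / n,
        "avg_temp": sum(j[2] for j in joined) / n,
        "timestamp": now,
    }


if __name__ == "__main__":
    bloodpr = [Reading(120, "p1", t) for t in range(1, 13)]
    pulse = [Reading(70 + t, "p1", t) for t in range(1, 13)]
    temp = [Reading(37 if t <= 6 else 39, "p1", t) for t in range(1, 13)]
    print(windowed_averages(bloodpr, pulse, temp, now=12))
```

Calling the function once per second with an increasing value of now mimics the behaviour described above, where a result tuple is produced each second as long as the input streams match the query.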
Note also that battery lifetime is not a critical issue, as it is in most wireless sensor networks. The DSMS runs on a PDA and not on the sensors. Finally, the time span of a rescue operation is closer to hours or days than to weeks or months. The rescue personnel may also carry an additional battery supply, if this becomes an issue.

Subscriptions may be inserted and deleted at any time; hence, the DSMS must be able to handle "subscriptions on the fly", i.e., add new CQs and remove obsolete ones at run-time. The matching of filtered events and CQ-subscriptions could be omitted by using only predefined queries with globally known IDs. However, it is difficult to make an exhaustive list of predefined subscriptions that match all kinds of possible events. A PM should be able to make tailored subscriptions for the patients, based on age, injuries, etc.

It is important to bear in mind that the rescue personnel should not have to state CQ-SL queries directly on a command line interface, since this would be too time consuming. A simple GUI on the PDA has to provide the rescuer with enough information so that sufficient queries can be created and sent to the DENS overlay.

4. Implementation and Experiments

We have implemented a first prototype of the system architecture described in the previous section, and done some preliminary tests in an emulation environment. As emulation platform, we use NEMAN [7]. NEMAN provides a virtual wireless network of hundreds of nodes on a single machine. The advantage of emulators over simulators is that they provide a virtual wireless network at the lowest layers, but allow real code to be run on the higher layers. The processes running on the virtual nodes bind to virtual network interfaces, i.e., TAP interfaces. On each virtual node, we have a DSMS process running. Data from sensors are pre-collected, and the samples are continuously read from a file and fed into the DSMS.
4.1 Implementation Issues

As the DENS mediators have already been implemented for the experimental results in [8], most of the implementation issues concern the interface between DENS and DSMS. The DENS mediators and interfaces are written in C. The interface is split into two parts: one that is linked to the running DENS instance, i.e., the PUB-DENS or DENS-DENS component, and one that is linked to the running DSMS instance. The two instances communicate using TCP.

The interface linked to the DENS instance has three main functions: subscription_to_wrapper(), parse_query(), and notification_to_wrapper(). The function subscription_to_wrapper() sends the query to the DSMS. The query is encapsulated in a structure that contains the additional subscription information. The function simply contacts the DSMS interface and inserts the query. The function parse_query() takes a query as its argument. The idea is that the function understands the query semantics and returns the queries that the DENS mediator is supposed to forward to other DENS mediators and publisher nodes, if that is required. This is ongoing work, and for now this function is static, i.e., we have not yet implemented the semantic query parsing. For each CQ-subscription, the DENS mediator has a table with the subscriber node ID and the query ID. The function notification_to_wrapper() sends notification streams on the DENS mediator node to the DSMS.

The interface that is linked to the DSMS establishes a connection to the DSMS and the DENS interface when started. When this connection is successfully established, it runs a loop, polling for subscriptions from subscription_to_wrapper() and for notifications from notification_to_wrapper(). When a subscription arrives, a new process is forked and this process contacts the DSMS with the current query when a notification arrives.
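The table that relates subscriber node IDs to query IDs can be sketched as follows. This is a minimal illustration in Python, not the C code of the DENS mediator. The class name MediatorSubscriptionTable and its methods are hypothetical, and the choice to let subscribers with textually identical queries share one query ID is an assumption made for the sketch, not a documented property of the prototype.

```python
import itertools


class MediatorSubscriptionTable:
    """Hypothetical per-mediator table: local query ID -> subscriber node IDs."""

    def __init__(self):
        self._next_id = itertools.count(1)  # query IDs are local to this node
        self._subscribers = {}              # query ID -> set of subscriber IDs
        self._query_ids = {}                # query text -> query ID

    def register(self, subscriber_id, query):
        """Store a CQ-subscription and return its locally generated query ID."""
        query_id = self._query_ids.get(query)
        if query_id is None:
            # Assumption: identical query texts share one ID.
            query_id = next(self._next_id)
            self._query_ids[query] = query_id
            self._subscribers[query_id] = set()
        self._subscribers[query_id].add(subscriber_id)
        return query_id

    def unregister(self, subscriber_id, query_id):
        """Remove one subscriber; drop the query when nobody is left."""
        subs = self._subscribers.get(query_id)
        if subs is None:
            return
        subs.discard(subscriber_id)
        if not subs:
            del self._subscribers[query_id]
            self._query_ids = {
                q: i for q, i in self._query_ids.items() if i != query_id
            }

    def subscribers_for(self, result_tuple):
        """Resolve the query ID carried by a DSMS result tuple to node IDs."""
        return sorted(self._subscribers.get(result_tuple["query_id"], ()))
```

In such a design, the query ID returned by register() would be attached to the query before it is handed to the DSMS, and subscribers_for() would be applied to each result tuple that the DSMS returns.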
The event stream, notification stream, and architecture overview, as described above, are shown in Figure 3.

[Figure 3: Information flow. Labels: Sensor, Event Stream of Raw Samples, Publisher, PUB-DENS, wrapper, TCQ, Stream of Notifications, DENS mediator, DENS-DENS, wrapper, TCQ]

Information flows either from the sensors as events or from publisher nodes as notifications. Either way, the process that starts the query receives the result tuples from the DSMS. For each result tuple, it creates a publication structure containing the tuple, which it sends directly to the DENS. When a DENS mediator receives a non-local publication, i.e., a notification stream, it calls notification_to_wrapper(), which sends the tuple information to the DSMS interface.

In our scenario, all tuples sent to the DSMS have the following generic structure: <value, patient, timestamp>. These are sent to the DSMS as different streams. The schema contains all the information we need at the current stage of the implementation; the patient attribute identifies the source, and the timestamp shows when the reading was performed.

When a DENS mediator receives a publication from its local DSMS, this is thus a result of the matching function. This result tuple also contains a unique query ID number. The DENS stores which queries the subscribers are interested in and uses the ID to identify the subscribers interested in the notification. The ID is created by each DENS mediator running the query, and is local to this node.

Note that the events can be combined over multiple sensor data streams. This means that the notification contains all these attributes. However, at the DENS mediator, the attributes are split into the corresponding sensor streams, which are sent to notification_to_wrapper(). This is done in order to let other queries make use of the same tuple streams. For example, a query wants to obtain tuples from both temperature and blood pressure data from a patient.
The tuples returned from that query contain both of these attributes. However, when the tuple is sent to another query, the temperature and the blood pressure are sent separately. This means that the simplest way of distributing queries in the network is to copy them to each DENS mediator. For now this query is statically defined and returned from parse_query().

4.2 Test Setup

In this section, we describe the test setup. On one machine (Dual Xeon 2.8 GHz running Linux 2.6.10), NEMAN is running and emulating a scenario similar to the scenario described in Section 3. In addition to the test machine we artificially create sensor data by using scripts. In later experiments, we aim to use real patient data.

We have implemented the functions filter() and match() for CQ-SL using the TelegraphCQ DSMS [9]. TelegraphCQ provides a simple interface for stating queries and sending streams. Since TelegraphCQ is based on PostgreSQL, it also provides the pqsql front-end for easily integrating TelegraphCQ with a C program.

[Figure 4: Test setup. Labels: N1 (SUB-CQ), N2 (SUB-DENS, DENS, TCQ), N3 (PUB-DENS, TCQ, Sensor data), N4 (PUB-DENS, TCQ, Sensor data), CQ1, Notification]

Figure 4 shows the test setup. A subscriber node N1 sends a subscription containing the continuous query CQ1 to N2. In our scenario implementation, the query is semantically similar to the one shown in Section 3, but written in TelegraphCQ's syntax. Based on the subscription language ID, the DENS mediator receiving the subscription decides which look-up() function to call. The look-up() function for the CQ-SL parses the subscription to find stream names that indicate what kind of information the subscriber is interested in, and returns the node IDs of the publisher nodes. In our static parsing, it returns temperature, pulse, and blood pressure, and node IDs N3 and N4. Hence, the subscription is sent to these nodes. The DENS mediator stores the subscriber node ID and the locally generated query ID.
It attaches the query ID to the query and sends it to TelegraphCQ running on the same node. When an event that matches the query takes place at the publisher node, a result is sent as a notification to the DENS mediator node, i.e., N2. The notification is then interpreted as a new stream that flows into TelegraphCQ at N2. The streams of filtered events are thus matched by TelegraphCQ at the mediator node, and it returns the data to the DENS together with the matching query ID. The DENS mediator can then do a look-up() in its subscription table to find the node IDs of interested subscribers, in this case N1.

4.3 Experiments and Results

As noted earlier, this paper shows our initial results. We aim to demonstrate that the nodes send notifications and that our first prototype works as intended. In the experiment, we investigate how much data is transmitted between the nodes in the network. We expect that approximately the same amount of data is transmitted from each of the nodes, since the same query is distributed to each node. Even though N2 receives data from both N3 and N4, the result is an average of all the tuple values it receives, and thus N2 transmits at a rate similar to that of N3 and N4. Among other things, these data will be used as guidelines for the optimization we plan to perform as part of the future work.

We use the system activity reporter sar for Linux to obtain information about the transmitted data at each node. We use sar data obtained during the run of our NEMAN scenario. The final result is an average of ten runs. We send the script-generated sensor data to TelegraphCQ at a rate of one tuple per second per stream. Based on the observations in [10], we can expect that TelegraphCQ manages to handle this stream and gives correct results.

Figure 5 shows the result plot. The y-axis shows the number of bytes transmitted each second, and the x-axis shows the time in seconds. We observe that the results are close to what we expect.
The three nodes send approximately the same amount of data each second. There are some irregularities in the amount of data transmitted, which we assume are due to routing information transmitted between the nodes.

[Figure 5: Emulation results]

5. Related Work

Most publish/subscribe systems do not support CQ-SL. DSMSs like STREAM [11], Borealis [12], and TelegraphCQ [9] filter data streams in real-time based on declarative queries. These systems also support simultaneous query execution, but they do not scale well when many concurrent queries are inserted and deleted over a given period of time. This may be an issue in publish/subscribe systems having a very large number of subscriptions and unsubscriptions. For instance, in CACQ [13], which is a part of TelegraphCQ, the maximum number of queries is statically defined at compile time. However, CACQ is used to optimize the queries so that several queries can run concurrently and share resources on one node. CACQ also manages to install and remove queries from TelegraphCQ at run-time, something which is interesting for us as well. In a setting where several subscription queries run on one node, these ideas might help the DENS layer to optimize overlapping queries, for example. The Borealis [12] stream processing system has ways to automatically distribute queries in a network consisting of several nodes, and might be interesting to integrate into our DENS project in the future. STREAM is not suited for usage in a DENS scenario, since it does not support installing and removing queries at run-time.

For large-scale publish/subscribe systems with millions of subscriptions and unsubscriptions, efficient matching is vital. Demers et al. [14] have developed a publish/subscribe system which supports optimizations for managing a high number of stateful subscriptions, and Gryphon [15] is a publish/subscribe system which has been extended to support a CQ-SL.
Since our idea is to support multiple subscription languages at the same time, not all subscriptions will be CQ-subscriptions and hence processed by a DSMS. We also do not aim for Internet-scale scenarios with a very large number of subscriptions, so the goal is rather to have a simple, yet efficient enough, design to support our purpose.

6. Conclusion and Future Work

This paper presents a solution for supporting CQ-SL in sparse MANETs. The DENS is language independent, so different applications may specify and use different languages, as long as the WD supports the language and the DENS overlay has the correct matching plug-in. We use a DSMS both for filtering at the source and for matching filtered events and subscriptions in the DENS overlay. We have performed a proof-of-concept implementation of our system architecture.

For future work, there are three main issues we are planning to investigate. First, a simple interface for the rescue personnel is needed. As mentioned in this paper, we aim to implement a GUI that fits the requirements of a typical rescue scenario. For example, we need to find simple solutions that make it possible to build queries without necessarily seeing them in SQL-like syntax. Sometimes it might be sufficient to just push a button to send a predefined query subscription.

The second issue we are interested in investigating is how the queries can be stated so that the DENS overlay manages to efficiently split queries and distribute them optimally. One possible solution is to use XML to describe the queries, and then implement translators to other query languages. In this case, we need a set of common operators that we know are used by many DSMSs, and a mapping from XML to these operators. This should be performed transparently in parse_query(). The idea is to integrate several DSMSs in our scenarios and use XML as a metalanguage.
Finally, we have experienced that most DSMSs, such as TelegraphCQ, are relatively large, i.e., over 150 MB in size. They also depend on a number of external libraries. It is therefore interesting to investigate the possibilities of using a less resource-intensive DSMS. One possible example is TinyDB [4], which is mostly deployed in sensor networks, but may also run on PDAs.

7. References

[1] E. Munthe-Kaas, O. Drugan, V. Goebel, T. Plagemann, M. Puzar, N. Sanderson, and K. S. Skjelsvik, "Mobile Middleware for Rescue and Emergency Scenarios (book chapter)", Mobile Middleware, Bellavista, P., Corradi, A. (Eds.), CRC Press, ISBN 0-8493-3833-6, September 2006.
[2] K. S. Skjelsvik, V. Goebel, and T. Plagemann, "Distributed Event Notification Service for Mobile Ad Hoc Networks", IEEE Distributed Systems Online, vol. 5(8), http://dsonline.computer.org/0408/f/o8002a.htm, 2004.
[3] L. Golab and M. Tamer Özsu, "Issues in Data Stream Management", SIGMOD Rec., 32(2):5-14, 2003.
[4] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong, "TinyDB: An Acquisitional Query Processing System for Sensor Networks", ACM TODS, 2005.
[5] N. Sanderson, V. Goebel, and E. Munthe-Kaas, "Knowledge Management in Mobile Ad-Hoc Networks for Rescue Scenarios", Workshop on Semantic Web Technology for Mobile and Ubiquitous Applications, ISWC 2004, November 2004.
[6] K. Skjelsvik, A. Lekova, V. Goebel, E. Munthe-Kaas, T. Plagemann, and N. Sanderson, "Supporting Multiple Subscription Languages by a Single Event Notification Overlay in Sparse MANETs", Proceedings of the ACM MobiDE 2006 Workshop, Chicago, USA, June 2006.
[7] M. Puzar and T. Plagemann, "NEMAN: A Network Emulator for Mobile Ad-Hoc Networks", in Proceedings of the 8th International Conference on Telecommunications (ConTEL 2005), Zagreb, Croatia, June 2005.
[8] K. S. Skjelsvik, V. Goebel, and T. Plagemann, "Evaluation of Distributed Event Notification Protocols for Highly Unstable MANETs", Technical Report #328, University of Oslo, Department of Informatics, September 2005.
[9] S. Chandrasekaran, O. Cooper, A. Deshpande, M. J. Franklin, J. M. Hellerstein, W. Hong, S. Krishnamurthy, S. Madden, V. Raman, F. Reiss, and M. Shah, "TelegraphCQ: Continuous Dataflow Processing for an Uncertain World", Proceedings of the 2003 CIDR Conference, 2003.
[10] J. Søberg, "Design, Implementation, and Evaluation of Network Monitoring Tasks with the TelegraphCQ Data Stream Management System", Master Thesis, University of Oslo, May 2006.
[11] A. Arasu, B. Babcock, S. Babu, J. Cieslewicz, M. Datar, K. Ito, R. Motwani, U. Srivastava, and J. Widom, "STREAM: The Stanford Data Stream Management System", Department of Computer Science, Stanford University, 2004.
[12] D. J. Abadi, Y. Ahmad, M. Balazinska, U. Cetintemel, M. Cherniack, J. H. Hwang, W. Lindner, A. S. Maskey, A. Rasin, E. Ryvkina, N. Tatbul, Y. Xing, and S. Zdonik, "The Design of the Borealis Stream Processing Engine", 2nd Biennial Conference on Innovative Data Systems Research (CIDR'05), Asilomar, CA, January 2005.
[13] S. Madden, M. Shah, J. Hellerstein, and V. Raman, "Continuously Adaptive Continuous Queries over Streams", in Proceedings of SIGMOD, June 2002.
[14] A. Demers, J. Gehrke, M. Hong, M. Riedewald, and M. White, "Towards Expressive Publish/Subscribe Systems", in Proceedings of the 10th International Conference on Extending Database Technology (EDBT), 2006.
[15] Y. Jin and R. Strom, "Relational Subscription Middleware for Internet-Scale Publish/Subscribe", in Proceedings of the 2nd International Workshop on Distributed Event-Based Systems, DEBS 2003, San Diego, California, June 2003.