Event-based communications in modular robots

Nicolas Melot
Université de Bretagne Occidentale (UBO, Brest, France)
University of Southern Denmark (SDU, Odense, Denmark)
nicolas.melot@etudiant.univ-brest.fr, nime@mmmi.sdu.dk

August 20, 2010

Abstract

Modular robots have been studied for more than twenty years, since Fukuda’s first representation of a cellular robot in 1988 [25]. Since then, many different modular robots have been developed: chain-based, lattice-based and hybrid modular robots, each with different characteristics. All share a common feature: they are composed of several separate, independent and autonomous modules. Each module is equipped with sensors, a processor, memory and actuators. A modular robot is an aggregation of modules. Modular robots may be reconfigured, sometimes on-line (without having to reinitialize them), or may even reconfigure themselves. Such robots need to be controlled, either with centralized or with distributed algorithms. This control requires modules to communicate heavily with each other, despite on-line changes in the physical network topology caused by physical reconfigurations. These highly dynamic network topologies make publish/subscribe communication particularly well suited. Publish/subscribe communications are based on events: some nodes express their interest in a given event pattern, and other nodes fire events that may or may not match these patterns. These events are forwarded along the links using a multicast technique, typically the classic multicast mode of network protocols such as IP, or other dedicated algorithms. These publish/subscribe algorithms require forwarders to know only a subset of the nodes that published an event and of the nodes interested in it, or do not require them to know this at all. This decoupling between data producers and consumers eases the management of a dynamic underlying network, so that messages can still be delivered to their matching recipients.
The work described in this document consists of the study of this communication scheme, its implementation on the Atron modular robot through two different algorithms, and an evaluation of their performance in various use cases. It demonstrates that in a static network environment, methods based on event routing tables show better performance than flooding-based strategies, but they also use more resources and are more complex to adapt to network dynamics.

Acknowledgments

I would like to thank Kasper Støy for having welcomed me in the modular robotics laboratory for these six months and for giving me the opportunity to learn a lot about modular robots and to investigate publish/subscribe communications. Thanks also to everyone in the laboratory, in particular David Johan Christensen and Ulrik Pagh Schultz, for their support and very helpful suggestions on the one hand, and for always making me feel good, comfortable and confident in the institute on the other hand. Thanks also to Payam, my office mate: I enjoyed sharing the institute’s best office with you during this period. I am grateful to everyone at the Mærsk Mc-Kinney Møller Institute, with whom it was a real pleasure to spend time despite the Danish language, and a good opportunity to learn about local practices such as smørrebrød and table soccer. It was a very nice opportunity to improve in these disciplines too. Finally, very special thanks to my parents, who always provided me with everything I ever needed and more during these years of study. Thanks to them I could focus efficiently on my studies and relax whenever the work became too stressful, without having to worry about any material concern. Thank you for your very reliable support, in good and bad periods alike.
Contents

1 Introduction
  1.1 An introduction to modular robots
    1.1.1 Modular robots’ features
  1.2 Atron modular robot
  1.3 Control for modular robots
  1.4 Publish/subscribe communication pattern
  1.5 Related work
  1.6 Motivations and content

2 An overview of publish/subscribe communications
  2.1 Introduction
  2.2 Underlying network protocol features
    2.2.1 Relying on classic network features
    2.2.2 Mobile and sensors networks as underlying structure
  2.3 Pub/sub network topology
    2.3.1 Broker overlays
    2.3.2 Peer to peer structured overlays
    2.3.3 Peer to peer unstructured overlay
  2.4 Subscription models
  2.5 Matching
  2.6 Routing notifications
    2.6.1 Events routing algorithms
  2.7 Conclusion

3 Implementing publish/subscribe communications on Atron
  3.1 Introduction
  3.2 Underlying network and suitable pub/sub algorithms
  3.3 Software architecture
    3.3.1 Algorithm synchronization with a distributed barrier
    3.3.2 Publish/subscribe algorithms
  3.4 Implementation details
    3.4.1 PublishSubscribe class: the publisher/subscribe interface to user
    3.4.2 Subscription model interface
    3.4.3 Single attribute implementation
  3.5 Barrier: distributed synchronization among nodes
  3.6 Event_core class: interface for algorithms’ implementations
    3.6.1 Event flooding algorithm
    3.6.2 Event filtering-based strategy
  3.7 Conclusion

4 Experiments
  4.1 Introduction
  4.2 Relevant features to measure
  4.3 Measurement methodology
    4.3.1 Performances along number of modules in a robot
    4.3.2 Performances along publishers/subscribers ratio in the system
    4.3.3 Performances along the number of events sequentially fired
    4.3.4 Performances along event firing rate
  4.4 Algorithms performance and comparison
    4.4.1 Experimental measurements of messages count
    4.4.2 Space analysis
  4.5 Conclusion

5 Conclusions
  5.1 Future work
  5.2 Conclusion

References

Chapter 1
Introduction

1.1 An introduction to modular robots

Cellular robotics is an approach to robotics opposed to monolithic robots. It originally aimed to give robots a high level of versatility and robustness rather than speed and precision in specific tasks. Such robots are composed of several modules, where modules are a robotic analogy to the cells of living beings. Støy et al.
[49, page 5] define a modular robot as a “Robot built from several physically independent units that encapsulate some of the complexity of their functionality”. However, the early idea in the 1980s referred to robots able to autonomously split into separate modules and later recombine to form a new, possibly different, robot (see figure 1.1). Such robots are now known as self-reconfigurable robots, defined in [49, pages 5 to 7] by the following criteria:

• Modular: The robot is built from several physically independent units that encapsulate some of the complexity of their functionality.
• Reconfigurable: The modules can be connected in several different ways to form different robots in terms of size, shape or function.
• Dynamically reconfigurable: The modules can be disconnected and connected while the robot is active.
• Self-reconfigurable: The robot can change the way modules are connected by itself.

Figure 1.1: Fukuda’s early representation of a cellular robot doing maintenance work in a storage tank (from [25]).

Figures 1.2(a) and 1.2(b) show examples of dynamically reconfigurable robots. It is important to note that a module is a robot in its own right: every module can acquire information through its sensors, process this information and behave on its own thanks to its memory and processing unit, and autonomously influence its environment using its actuators. In addition, a module can explicitly communicate with other modules using its communication devices, and runs on an individual power source, whether this source is internal, such as an individual battery, or external, such as a power plug.

(a) Odin modular robot is a lattice-type dynamically reconfigurable and self-deformable robot (from [28]). (b) Thor robot is composed of several different module types which share mechanical strength from dedicated modules (from [35]).
Figure 1.2: Two modular robots developed at the Mærsk Mc-Kinney Møller Institute (University of Southern Denmark).

1.1.1 Modular robots’ features

Goals of modular robotics

Due to their modular nature, cellular robots present interesting features. First of all, being modular allows a high level of redundancy. The impact of a failing module on the entire robot decreases as robots evolve from modular to self-reconfigurable: the failing module may be easily replaced on a modular robot, and even more easily on a reconfigurable modular robot, since this is one goal of its design. A dynamically reconfigurable robot may just halt its work until the module is replaced, then resume its task where it stopped. Finally, a self-reconfigurable robot can be self-repairing: it can replace the defective module by itself and resume its work without any external intervention. Reconfigurable robots can also be versatile, as their modules can be combined and recombined in different ways to form the basis of a wide range of different robots. They are adaptable, as they can morph to better fit different tasks or environments. Finally, modular robots are cheap relative to their complexity: as a modular robot is composed of several modules of one or very few different types, these modules can be mass-produced, reducing the overall price of a robot.

Not all of these features hold for every modular robot. As they may require a more or less complex controller and mechanical structure, the features described above often exist only to a limited degree. Hence being robust, versatile, adaptable and cheap to mass-produce only represents an ideal goal. However, these characteristics guide the design of all modular robots.

Self-reconfiguration in modular robots

Different self-reconfigurable robots use different strategies to self-reconfigure.
There are three different techniques for a self-reconfigurable robot to morph, and thus three different types of self-reconfigurable robots: chain-type robots, lattice-type robots and hybrid robots.

Chain-type robots are mainly intended for locomotion. They are very effective at moving but, as a drawback, their self-reconfiguration is very slow. A chain-type robot is composed of chains of modules connected in a tree topology. Each module typically has one or two rotational degrees of freedom perpendicular to the chain it is connected to. As a consequence, the longer a chain is, the more flexible it is. Such robots have a wide range of locomotion styles: walking, crawling and even rolling; thus they are particularly suitable for crossing any type of terrain. However, self-reconfiguration in chain-type robots is very difficult and time-consuming. It consists of two steps: first a chain bends to move its end module close to the one it wants to connect to, then the two modules look for each other, align and slowly approach to finally connect. Many self-reconfigurations can be performed sequentially so that the whole robot can completely change its general shape, allowing it to have very different locomotion styles. For instance, a chain-type robot can start as a long crawling chain. After a while, it can change its shape through several reconfigurations and become a legged robot, or it can even take the shape of a circle and roll. An example of a chain-type self-reconfigurable robot is shown in figure 1.3.

Figure 1.3: Polybot chain-type robot in a wheel configuration (from [54]).

Lattice-type robots organize their modules in a lattice structure similar to atoms in a crystalline solid. This means that every module is restricted to a certain set of positions and orientations, and can only move from one position in the lattice to another. Lattice-type robots are not as good as chain-type robots concerning locomotion.
On the other hand, they are better suited to self-reconfiguration, which is relatively fast compared to that of chain-type robots: modules no longer have to look for each other and synchronize when reconfiguring. Instead, modules can move with less help from their neighbors and directly connect to them, assuming they are in the right position and orientation. There are several ways such modular robots can achieve reconfiguration. Modules can move from one position to another by rolling around a neighboring module; modular robots implementing this technique are usually composed of modules with two sub-modules occupying two neighboring positions. Alternatively, a modular robot can exploit contractions and expansions to move around the lattice: two modules contract so that they fit into the same lattice position, and pull a third module with them. Then they expand in another direction and push other modules. Finally, modules can use a track to slide along a surface formed by other modules. This last strategy has been implemented in two dimensions but remains hard to generalize to three dimensions. Lattice-type robots can only perform a very limited type of locomotion through reconfiguration. It is often called cluster-flow locomotion, as each module moves from rear to front, making the overall robot change its location.

Hybrid robots exploit the advantages of both chain-type and lattice-type robots. They can alternately be chain-type or lattice-type, but never both at the same time. They use the chain-type form to perform efficient locomotion, then take advantage of the simplicity provided by the lattice to self-reconfigure into another shape. Once reconfigured, they switch back to the chain-type form for efficient locomotion. It is interesting to note that hybrid modular robots are not more complex than lattice ones. Therefore, most modern robots are hybrid, as the functionalities of both techniques come at no extra cost.
Pack, Herd and Swarm

Modular robots may differ in the number of modules required to form the overall robot, and in how small the modules are. Støy et al. [49] define three main categories named pack, herd and swarm robots.

Pack robots consist of between a few and several tens of modules. A module’s strength is comparable to the strength of the group: a module can lift itself, as well as a significant portion of the complete modular robot. Thus a module can be a complete functional unit such as a leg. In such robots, each module plays an important role, and all modules must be strictly coordinated to allow the robot to achieve its final goal. Central control of a pack modular robot is possible, although a distributed control algorithm might be more suitable. Støy et al. [49] compare this type of robot to a pack of wolves, where every individual’s performance has a significant impact on the hunt.

Herd modular robots consist of hundreds of modules. The strength of an individual module has a moderate influence on the whole group, and an individual’s functionality is limited. A module in a herd robot can still lift itself, but it is not sufficient to be a functional unit on its own; instead, a functional unit is always composed of several modules. A herd robot has enough redundancy to allow less strict coordination and can more easily handle module failures. Modules in a functional unit must cooperate to achieve a goal, but not as tightly as modules in pack robots. Central control of herd robots may result in a performance drop. Modules in this kind of modular robot may be compared to deer in a herd, hence the name herd robots.

Swarm robots consist of myriads of modules. Each of them is weak and has a limited individual impact on the whole robot.
They must cooperate to have a significant influence on the whole group; thus a functional unit is composed of a massive number of modules, and a massive number of functional units is required to build a complete swarm robot. This kind of robot cannot be controlled centrally, and every module must have a high degree of autonomy. Instead of being directly controlled, such robots live and act by their own rules, similarly to a swarm of bees or ants.

This classification is important, not only from the mechanical point of view, but also regarding controllers. Pack and swarm robots are controlled in different manners: pack robots need tight coordination and can even be controlled centrally, and this need for tight coordination rules out the use of stochastic algorithms. On the other hand, the modules of a swarm robot are too numerous to be precisely controlled: the only option is to rely on a certain degree of randomness in control algorithms, while various failures can be handled by the high module redundancy available in such robots. Therefore controllers for pack robots cannot be applied to swarm robots and vice versa. Finally, herd robots present a problem, as they can hardly be controlled by a centralized or a deterministic distributed algorithm: the number of modules as well as their degrees of freedom make tight control difficult. On the other hand, the comparatively small number of modules makes probabilistic algorithms hardly reliable. Thus herd robots may be hard to control.

1.2 Atron modular robot

The Atron modular robot is a self-reconfigurable robot composed of several identical modules. These modules, typically a few dozen, are organized in a lattice and tied together through mechanical connectors. Each module is made of two hemispheres able to rotate around a common axis; this rotation capability is the module’s only actuator.
Each hemisphere is also equipped with two male connectors, two female connectors and four infrared communication devices. Using its modules’ actuators and connectors, an Atron robot is able to self-reconfigure. When two modules are connected together, their infrared communication devices face each other, and a physical network link is created.

(a) Atron modular robot in different configurations. (b) Main components of an Atron module.
Figure 1.4: An Atron robot and an Atron module.

Atron modules are independent robots in their own right: each has its own sensors, actuator, memory and processor. Several versions of Atron modules have been designed; the latest version includes an 8-bit Atmega128 processor, 4 KB of memory and 128 KB of flash memory. Each Atron module runs a custom version of TinyOS based on version 2.1.0, to which components have been added to drive the sensors, the connectors, the actuator and the half-duplex infrared communication devices. The Atron controller can also rely on the ASE framework [4]. ASE (Assemble and Animate) is a framework for modular robots developed at the Mærsk Mc-Kinney Møller Institute. It provides many algorithms to be used in robot controllers, such as learning, locomotion or communication algorithms.

1.3 Control for modular robots

As mentioned in section 1.1.1, modular robots may be controlled centrally, but a distributed control algorithm is preferable. Distributed algorithms make intensive use of communication so that the different, independent nodes cooperate with each other. As a consequence, inter-module communication is a critical matter in modular robots. Different robots have different communication devices, giving them more or less efficient communication capabilities: Odin’s communications are based on a shared wired bus traversing all modules, which gives it high-speed and reliable communications.
In addition, Odin is able to dynamically split the internal shared bus into several independent buses in order to reduce the number of modules sharing the same bus (see [27]). On the other hand, Atron modules are equipped with 8 infrared devices allowing them to communicate directly with all their immediate neighbors. However, infrared-based devices are generally slower and more error-prone. Also, since an Atron communication device only allows two modules to communicate with each other, an important part of Atron’s communications consists either of routing or of broadcasting. This constraint forces the Atron controller to dedicate more processing activity to communications.

It may be possible to know the network topology of a modular robot in advance, but this property no longer holds when the modular robot is reconfigurable. Dynamically reconfigurable robots make the problem even more difficult to handle. In a reconfigurable robot, it is possible to design an early initialization phase where modules build a routing table and then start working. Since a dynamically reconfigurable robot may be reconfigured on-line, this knowledge of the network topology is no longer valid afterwards. However, it is still possible to run a new network discovery phase after every reconfiguration. The problem is similar for self-reconfigurable robots, except that they are likely to self-reconfigure much more frequently, which makes such network discovery phases harder. Figure 1.5 shows different possible network topologies for an Atron robot.

Figure 1.5: Different configurations of a modular robot generate different networks.

As a consequence, modular robots, and more particularly dynamically reconfigurable robots, need a communication mechanism able to handle such dynamics in the physical network. A node must be able to receive information from another node, for instance to get a value from a sensor that exists only in this second node.
It must be able to do so wherever this node is in the network. One way to achieve this goal consists in having modules that hold interesting data let other interested modules get this data whenever it is available. Receiving nodes do not have to know which module the data comes from, as long as it is valuable to them. This communication approach, named publish/subscribe communication, is introduced in section 1.4.

1.4 Publish/subscribe communication pattern

The publish/subscribe communication pattern is an alternative to the classic client/server model in networks. It provides an efficient way to spread information among interested nodes in a network, since these nodes do not have to request this information each time they need it. Instead, subscribers express their interests once, and receive a notification carrying data each time a publisher generates data matching this interest. This is an event-based communication scheme, where publishers fire events and subscribers listen to them. This communication pattern decouples subscribers from publishers: they do not need to know each other. Instead, they need to know other nodes’ interests more or less roughly, depending on the publish/subscribe techniques involved in the implementation. Publish/subscribe communications can be useful in wide area networks such as the Internet. In fact, event-based communications exist in very common uses like RSS feeds, where a user uses an RSS reader to subscribe to a feed, then waits for news related to this feed to be automatically delivered to the reader software. By nature, publish/subscribe communications support more flexible underlying networks than the client/server scheme, thanks to their decoupling property. Thus they can be particularly efficient in wireless sensor networks, ubiquitous computing or, more generally, in Mobile Ad-hoc Networks (MANET). They can provide a good solution for communications in modular robots.

1.5 Related work

Baldoni et al. [8] and Legatheaux Martins et al.
[34] give surveys of publish/subscribe communications, the different features they involve and the different approaches investigated for each feature. They also describe implementations in various projects such as Hermes [44], Medym [12] or Pastry [46]. Most of the documented solutions rely on an underlying network providing unicast, broadcast or multicast services. This underlying network is usually built on IP, TCP and/or UDP ([11, 12, 15, 29, 41, 51]). These solutions handle node mobility or failures, but assume there is always a working unicast, broadcast or multicast primitive available. This makes them rely on a structured network overlay, which is not available in modular robots. On the other hand, Sivaharan et al. [48] describe a gossip-based method on top of mobile and unstructured networks. All communications rely on an unstructured peer-to-peer layout, and communication primitives are provided exclusively by 802.11g. Hall et al. [30] also give an adaptation, for sensor networks, of the routing table-based solution described in [14].

1.6 Motivations and content

The main goal of the work described in this document is the investigation, implementation and evaluation of different publish/subscribe communication mechanisms. These event-based communication mechanisms must be suitable for the control of modular robots. Chapter 2 gives an overview of state-of-the-art publish/subscribe communications, then chapter 3 presents in detail two different implementations for Atron modular robots. Another chapter introduces performance measurements and a comparative study of the two implemented techniques. The first is event flooding-based: it floods events through all available communication devices. The second uses subscriptions to build routing tables and efficiently forward all events fired in the network.
This comparison shows better performance for the routing table-based algorithm, and discusses the reasons for this advantage as well as its cost. The document ends with suggestions to lower the cost of the routing table-based method’s good performance, and concludes.

Chapter 2
An overview of publish/subscribe communications

2.1 Introduction

Chapter 1 introduced modular robots and their characteristics, as well as the publish/subscribe pattern as a suitable solution for inter-module communications. This section sums up the introduction to publish/subscribe mechanisms given in [6, 34]. A pub/sub system aims to deliver event notifications to the nodes which have subscribed to them. Its performance can be measured in terms of the number of mistakes (events not delivered to all interested nodes, or events delivered to nodes which are not interested in them) and of communication cost. In publish/subscribe algorithms, there are several features to consider:

• Underlying network protocol features
• Pub/sub-level network infrastructure
• Subscription model
• Matching events with subscriptions
• Subscription routing
• Routing notifications

These different aspects are explained in detail in the following sections.

2.2 Underlying network protocol features

As publish/subscribe systems are communication systems, they have to rely on one or several means of communication with other nodes. These means of communication are provided by the underlying network protocol, which can offer several features with different capabilities and weaknesses.

2.2.1 Relying on classic network features

Event-based communications built on transport-level network stacks such as TCP or UDP offer an easy way to spread subscriptions and notifications in a network, as the stack handles everything related to message routing and network resilience. TCP/UDP/IP is widely deployed, which makes such a pub/sub system easy to deploy.
On the other hand, deploying such a system in a wide network such as the Internet can raise numerous problems, like the security policies and firewall rules applied by network administrators. Besides, in embedded systems, such a complete networking stack may not be implemented, because of the tight memory and computing power constraints imposed by such devices.

Another convenient feature to rely on when designing publish/subscribe systems is multicast. As event-based networks are likely to have one event publisher for several subscribers, sending one multicast message addressing dozens or hundreds of recipients, instead of dozens or hundreds of unicast TCP/UDP messages, may drastically reduce network overhead. However, while using multicast to deliver to topic-based subscribers is easy, content-based multicast addressing is more challenging. The main problem lies in the number of groups of recipients. In topic-based subscriptions, the number of multicast groups equals the number of topics, which might be limited. In content-based subscriptions, there are as many multicast groups as there are different subscriptions, each denoting a different subspace of the notification space. As this number can easily get very high, the problem that arises is the lack of multicast groups to handle all these notification groups. A solution to this problem consists in mapping different subscriptions to the same multicast group, while keeping wrong deliveries (deliveries to uninterested nodes or lack of deliveries to interested nodes) as low as possible. This is called the channelization problem, a computationally hard problem described in [34], for which approximate solutions are proposed in [1]. Also, IP multicast lacks widespread deployment [22]. Baldoni and Virgillito [6] claim it is generally not considered a feasible solution for applications deployed over a WAN.
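The trade-off behind the channelization problem can be illustrated with a minimal sketch. It assumes, purely for illustration, that a content-based subscription is a closed interval over a single numeric attribute, and it uses a naive grouping (sort by lower bound, cut into k chunks) rather than any of the approximate solutions of [1]. All names (make_groups, group_filter, deliver) are hypothetical. Each group is addressed through the smallest interval covering all its members, so no interested node is missed, but fewer groups mean more deliveries to uninterested nodes.

```python
from typing import List, Set, Tuple

# Assumption: a subscription is a closed interval (lo, hi) on one attribute.
Interval = Tuple[float, float]


def matches(sub: Interval, value: float) -> bool:
    """True if the event value falls inside the subscription interval."""
    return sub[0] <= value <= sub[1]


def make_groups(subs: List[Interval], k: int) -> List[List[int]]:
    """Naive channelization: sort subscribers by interval and cut the
    sorted list into at most k contiguous groups of subscriber indices."""
    order = sorted(range(len(subs)), key=lambda i: subs[i])
    size = max(1, -(-len(order) // k))  # ceiling division
    return [order[i:i + size] for i in range(0, len(order), size)]


def group_filter(subs: List[Interval], group: List[int]) -> Interval:
    """Smallest interval covering every subscription of the group."""
    return (min(subs[i][0] for i in group), max(subs[i][1] for i in group))


def deliver(subs: List[Interval], groups: List[List[int]],
            value: float) -> Tuple[Set[int], Set[int]]:
    """Send the event to every group whose covering filter matches.
    Returns (all receivers, receivers that were not interested)."""
    receivers: Set[int] = set()
    for group in groups:
        if matches(group_filter(subs, group), value):
            receivers.update(group)
    unwanted = {i for i in receivers if not matches(subs[i], value)}
    return receivers, unwanted
```

With the four subscriptions (0, 10), (5, 15), (40, 50) and (45, 60) and an event of value 12, one group per subscription reaches only the interested subscriber, two groups add one unwanted delivery, and a single group delivers the event to all four nodes although only one is interested.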
2.2.2 Mobile and sensor networks as underlying structure

The underlying network infrastructure is the network on top of which the publish/subscribe communication system works, together with the services and properties it guarantees. This matters because different pub/sub systems do not exploit the same underlying communications: they can rely on neighbor-to-neighbor communications only, or on network multicasting capabilities, if any. A pub/sub system also has to deal with the network’s stability, especially if the physical devices are expected to move.

There are mainly two approaches in publish/subscribe systems for mobile networks. In the first model, all nodes are considered able to move; such networks are called MANETs, for Mobile Ad-hoc Networks [3, 31, 37, 48]. The second approach considers that a subset of the nodes forms a stable network topology, around which the other, mobile nodes roam while never being more than one hop away from it. Mobility induces phenomena such as node unavailability or temporary disconnections. Location awareness is also an important matter in publish/subscribe systems, as a mobile node may not be interested in an event happening far away from its physical location. Baldoni and Virgillito [6] mention that pub/sub communications can rely on a transport protocol built over 802.11b, or on 802.11b itself.

Sensor networks fit the publish/subscribe communication scheme particularly well, as many nodes produce data by sensing their environment, and a few others collect it. Sensors are event publishers while data collectors subscribe to the data they are interested in, as described in [33]. Therefore, one can expect a sensor network to have many publishers and few subscribers. By nature, wireless sensor networks communicate in a broadcast manner and are largely limited by their power supply. Therefore, even if the network can be considered fixed, the topology can change as nodes enter standby periods to save power, or fail. Hall et al.
[30] and Costa et al. [18] describe publish/subscribe solutions for wireless sensor networks.

2.3 Pub/sub network topology

The pub/sub-level network organizes the way all nodes (subscribers and publishers) communicate with each other. It can be seen as a higher-level network (a network on top of the underlying network). The underlying network infrastructure directly influences the design of this layer: depending on the underlying layer’s properties, such as stability or communication routines, the pub/sub-level network can be designed to favor different properties that affect its performance, such as scalability or resistance to instability in the underlying network. Publish/subscribe networks can be designed following two different approaches. They can consist of brokers, which are responsible for managing event deliveries, and of the brokers’ clients, which are publisher or subscriber nodes. The second approach consists of a decentralized peer-to-peer network where all nodes can both fire and listen to events, and are responsible for their transport through the network.

2.3.1 Broker overlays

A broker overlay is an application-layer network lying on an efficient network protocol. This protocol typically provides methods for unicast, broadcast and/or multicast communications. A broker overlay fits wide-area, Internet-size networks well. Brokers are independent, cooperate to deliver events and subscriptions, and only need to be aware of a subset of subscriptions. Clients connect to brokers to notify an event (which is then routed between brokers) or to express their subscriptions. If the network admits only one broker, then it is a centralized publish/subscribe system. A broker overlay is likely to be static: topology changes are rare and happen either when a new broker joins the network, or when a broker fails. Brokers connect to the clients they know to deliver event notifications to them.
Clients can connect to the closest broker they know (from the point of view of the underlying network), but they can also connect to other brokers, depending on the routing strategy and the expected network behavior; see section 2.6.1. A broker topology can be either flat or hierarchical. In a flat topology, every broker can communicate with any other broker, whereas a hierarchical topology features several tree structures where subscribers are leaves and publishers are roots (or vice versa); notifications are sent in only one direction in the tree. A flat topology allows brokers to share their workload evenly, but they need a greater knowledge of all subscriptions in the network in order to route a notification to the right next broker. On the other hand, tree structures relieve brokers from having to know much about the subscriptions other brokers manage, but brokers at the top of the trees generally experience a heavier load than the ones closer to the leaves.

2.3.2 Peer-to-peer structured overlays

A peer-to-peer structured overlay for a publish/subscribe system is a self-organized network where each node self-assigns a unique id, which permits efficient broadcast, unicast and multicast communications among the nodes. The fixed structure makes sure each node id matches exactly one node in the system despite node arrivals, departures and failures. A peer-to-peer structured overlay handles topology changes better than broker-based publish/subscribe networks, and is thus better suited to unmanaged environments. Building such an overlay abstracts away the consequences of self-reorganization on the network. Events are routed using only the underlying network capabilities. Many such systems have been implemented, based on unicast communications (Pastry [46]) or multicast [45], topic-based (Bayeux [56]) or content-based (Rebeca [52]).
2.3.3 Peer-to-peer unstructured overlay

An unstructured overlay is an unmanaged topology, which aims to organize small networks despite failures and node departures and arrivals. Unstructured overlays mainly use flooding, gossiping or random communications (see section 2.6) to spread events and subscriptions. This is due to the lack of any structure facilitating message routing through the network: a representation of the network in memory is difficult to keep up to date while its topology constantly changes. Unstructured overlays are probabilistic, since there is always a possibility that a search fails to find an existing element. Peer-to-peer unstructured overlays suit mobile networks and sensor networks well. The high dynamics of mobile networks make routing tables difficult to maintain, and/or this maintenance consumes too many computing resources [31][3]. Typical event routing solutions rely on the underlying peer-to-peer network, usually a MAC network, and exploit characteristics of mobile and sensor networks such as message broadcasting.

2.4 Subscription models

The subscription model rules how a node expresses its interest in a particular event that could be fired. There are several subscription models, among which the topic-based and content-based models stand out. In the topic-based subscription model, subscribers express their interest in a particular topic and receive all notifications related to this topic. A topic can be, for example, a proximity sensor value that a publisher publishes about whenever it decides to. Each topic can be likened to a channel ideally connecting every publisher to every subscriber of this topic. This is a coarse-grained subscription model similar to RSS feeds on the Internet; it can be refined through wildcard characters in topic names or through topic hierarchies [34]: if topic A is the parent of topics B and C (B and C are siblings), then subscribers of A receive events A, B and C.
On the other hand, subscribers of B only receive events matching B, no event C and only a subset of events A.

The content-based subscription model is a generalization of the topic-based one. It provides a much finer correspondence between a node’s needs and the subscriptions it expresses. Events are not classified according to a predefined criterion (the topic name in topic-based models), but according to properties of the events themselves. A subscription is a set of constraints on some attributes, which can be expressed as an equality ($A_i = v_i$) or as membership in a range ($A_i \in [v_{min} ; v_{max}]$ or $A_i < v_i$). A topic-based subscription is a content-based subscription consisting of only one equality constraint over an attribute describing the event’s name. Since content-based subscriptions involve several constraints over attributes, several tests over attributes must be performed to check whether an event matches a content-based subscription. This more complex event matching increases the computing power required to determine the set of interested nodes, and thus the cost of event routing. Aguilera et al. [2] and Mühl et al. [40], chapter 3, present efficient matching algorithms for content-based systems. More details about content-based subscriptions and matching strategies for them are given in section 2.5.

More subscription models exist, like the type-based subscription model [23, 24], where events are objects belonging to a specific type. Coarse-grained filtering can be achieved thanks to the objects’ type, as in topic-based models, and finer-grained filtering is made possible by the objects’ attributes and methods. Thus the type-based subscription model represents a trade-off between topic-based and content-based systems. Concept-based systems (Cilia [17]) exploit the capabilities of ontologies to eliminate the previous approaches’ need for well-known, shared knowledge about attribute features (type, size) and meaning. XML-based approaches use the XML structure to implement hierarchy in events.
XML representation also allows interoperability but requires heavier processing [48]. Finally, the location-aware model particularly suits mobile environments, as nodes may only be interested in events happening in their proximity. Such an approach requires the pub/sub system to be able to track its nodes’ mobility. Location-aware systems are described in [20].

When a publisher wants to fire an event, or when a node has to forward (route) an event it has received, it has to check whether sending this event is meaningful. Sending an event through a channel is generally meaningful when nodes interested in this event are known to lie in the direction of this channel. Deciding whether there are nodes interested in an event is tightly coupled with matching an event to a subscription: an event matches a subscription if the node that emitted the subscription is interested in the event. The matching algorithm depends directly on the subscription model and can be more or less complex.

Finally, routing subscriptions and notifications along the publish/subscribe network consists in making as many relevant nodes as possible aware of a subscription or an event, using the possibilities offered by the publish/subscribe network. This is the core of an event-based communication system. There are several routing strategies for subscriptions and events, and they can be very different. However, they can be classified as either deterministic or gossip-based methods. The former use routing tables, usually built from subscriptions, while the latter spread events in a more or less random way. Both approaches have their advantages and drawbacks. The deterministic approach achieves very good performance in terms of delivery errors and network use, but gossip-based methods tolerate unstable underlying networks much better.

2.5 Matching

Matching an event with a subscription is crucial to route events efficiently toward interested nodes.
It allows routers to decide whether or not they should convey an event toward a particular channel. Matching events with subscriptions is tightly coupled with the subscription model the publish/subscribe system uses (see section 2.4). In this section, the subscription model is assumed to be content-based, as it is one of the most described in publications and is more general than the topic-based subscription model.

In content-based systems, a notification is a set of equality relations between attribute names and values, over all $n$ attributes the system allows. A notification $e$ (also called “event”) can be written $\{A_i = v_i \mid \forall i \in [1; n]\}$, where $v_i$ is either an exact value belonging to the type of attribute $A_i$, or the value “any”. The “any” value means that this attribute is not relevant in this notification. An attribute not mentioned in an event takes the “any” value. In other words, if $attributes(e)$ gives the set of all attributes mentioned in a notification $e$, then $\neg(A_i \in attributes(e)) \Leftrightarrow A_i = \text{“any”}$. Let the notification space $E$ be the space containing all notifications the system allows; $E$ is an $n$-dimensional space, and a notification $e$ with no “any” attribute is a single point of $E$ ($e \in E$).

A content-based subscription is a set of constraints over all attributes, each being an equality, a difference or an inequality with respect to a value, or membership in a value range. Again, the constraint on an attribute $A_i$ can be an equality to “any” ($A_i = \text{“any”}$), which means this attribute is not relevant in this subscription. In this case, the attribute does not have to be mentioned. A subscription $s$ describes a subspace of the notification space $E$ ($s \subseteq E$). A subscription $S_1$ is said to cover a subscription $S_2$ if $S_2$ is a subspace of $S_1$, that is, if $S_2 \subseteq S_1$. Also, a subscription $S_1$ intercepts a subscription $S_2$ if $S_1 \cap S_2 \neq \emptyset$. A notification $e$ matches a subscription $s$ if it denotes a point in subspace $s$.
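As an illustration of these definitions, the following Python sketch checks whether a notification matches a content-based subscription. The representation is a hypothetical choice made for this example only (dictionaries, the `Eq` and `Range` constraint classes and the `matches` function are not taken from [2] or [40]). It also assumes that an event whose constrained attribute is “any” does not match, a case the definitions above leave open. Constraints are tested sequentially, which is the naive strategy that dedicated matching algorithms improve upon.

```python
from dataclasses import dataclass

ANY = "any"  # value of an attribute that is not relevant (hypothetical encoding)


@dataclass(frozen=True)
class Eq:
    """Equality constraint A_i = v_i."""
    value: object

    def accepts(self, v):
        return v == self.value


@dataclass(frozen=True)
class Range:
    """Range constraint A_i in [vmin; vmax]."""
    vmin: float
    vmax: float

    def accepts(self, v):
        return self.vmin <= v <= self.vmax


def matches(event, subscription):
    """Return True if the event denotes a point inside the subscription.

    event:        dict attribute name -> value (missing attribute means "any")
    subscription: dict attribute name -> constraint (missing means unconstrained)
    """
    for attribute, constraint in subscription.items():
        value = event.get(attribute, ANY)
        # Assumption: an unspecified attribute cannot satisfy a constraint.
        if value == ANY or not constraint.accepts(value):
            return False
    return True


event = {"topic": "proximity", "distance": 12}
subscription = {"topic": Eq("proximity"), "distance": Range(0, 20)}
print(matches(event, subscription))
```

A topic-based subscription corresponds here to a dictionary holding a single `Eq` constraint on the attribute that names the event.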
In other words, if $match(e, s)$ is true when $e$ matches $s$ and false when it does not, then $match(e, s) \Leftrightarrow e \in s$.

While matching a notification against a topic-based subscription is trivial (it only requires checking an equality constraint over a single attribute), matching an event against a content-based subscription can be much more difficult, due to the higher number of constraints to check. Sequentially matching all attributes of each notification against all subscriptions can easily lead to poor performance, and smarter algorithms are required. Aguilera et al. [2] and Mühl et al. [40] give solutions to match events with subscriptions. They consist in building efficient filters from all known subscriptions, then using these filters to match events. Kale et al. [32] provide a formal complexity analysis and comparison of matching algorithms.

2.6 Routing notifications

Routing notifications is the core concern of a publish/subscribe communication system. Baldoni et al. [8] and Legatheaux Martins et al. [34] give a formal specification of the event routing problem. It basically consists in conveying events through the network toward all interested nodes, while avoiding nodes that are not interested in them. A routing algorithm aims to guide notifications through the network as efficiently as possible with regard to the criteria described in section 4.2. Routing algorithms can be split into two main categories: one is based on deterministic event routing tables, while the other is based on stochastic strategies. While the former can achieve a good level of performance (with regard to section 4.2), it is also exposed to very bad results when event routing tables become out of date. On the other hand, probabilistic methods do not give a 100% guarantee of any result, but however unstable the network is, they provide a high probability that an event is actually delivered.
2.6.1 Event routing algorithms

While several different algorithms exist to guide events in a network overlay, they can be classified into two main categories. The first consists of algorithms based on deterministic methods, like multicasting, flooding and selective broadcasting. They are said to be deterministic because they need the nodes to keep precise knowledge about the network, usually other nodes’ subscriptions and their location. This knowledge is acquired using a deterministic method, and exploited by a deterministic algorithm to build and maintain an event routing table. Finally, this table deterministically guides events in the right directions through the network. In stable underlying networks, deterministic algorithms can provide solutions that are both correct and optimal. However, it is important to keep in mind that, in spite of the use of deterministic routing tables, deterministic algorithms do not imply a deterministic guarantee on deliveries: fired events may not be delivered to their subscribers (false negatives), and may be delivered to uninterested nodes (false positives). This happens between the moment when the underlying network topology changes and the moment when the routing tables have been correctly updated.

The second category consists of probabilistic methods. Unlike deterministic algorithms, probability-based routing does not guarantee a correct or optimal solution, but it provides a low probability of false positives or false negatives. On the other hand, probabilistic methods do not have to maintain routing tables, and are therefore more robust than deterministic algorithms with regard to changes in the underlying network topology.

Event flooding

With regard to the routing performance measurements described in section 4.2, an ideal solution concerning false negatives and memory overhead is event flooding. Event flooding simply consists, for event publishers and routers, in broadcasting every event they have received or fired on each channel they can use.
In the former case, the event is not forwarded to the channel from which it has been received. This routing algorithm is simple to implement, does not reduce subscription expressiveness, and can serve as a reference when measuring performance. However, it is generally not used because of its poor scalability, since it generates a lot of false positives and network overhead.

Another flooding method is subscription flooding. Every subscriber broadcasts its interest and its id, so that every node is aware of all subscriptions in the entire system. Nodes can thus build a routing table to form a publish/subscribe-level network overlay. When publishers fire events, they reach only matching subscribers, which makes this algorithm both correct and optimal regarding notification routing. However, broadcasting subscriptions proved to be very costly ([13, 47]), especially when they change at a high rate: every node has to broadcast all its subscriptions through the whole network, causing significant network overhead, as event flooding does. Cao and Singh [12] present some work around subscription flooding.

An optimal algorithm: ideal multicast

A notification $n$ should be routed to all nodes subscribing to this event, preferably by the shortest path. Since broadcasting consumes too much bandwidth and processing power (to discard false positives in each node), another idealized and popular solution relies on underlying multicasting systems, using one multicast tree for each different group of subscribers. Legatheaux Martins et al. [34], section III-A, describe this idealized solution in formal detail. Every node analyzes its own set of known subscriptions, and extracts a partition $P$ of the union of subscriptions such that all notifications in an element $p$ of $P$ ($p \in P$) belong to exactly the same set of subscriptions (equation 2.1).
Since $P$ is a partition of the union $S$ of all subscriptions, any event $e$ belonging to at least one subscription of $S$ belongs to exactly one element $p$ of $P$. Each $p$ is mapped to a multicast group $G_p$, whose recipient set is given by equation 2.2 (where $nodes(s)$ gives the set of all nodes interested in subscription $s$). An event belonging to $p$ is forwarded to $G_p$.

$$\forall p \in P \wedge \forall e \in p \wedge \forall s_1, s_2 \in S : e \in s_1 \Leftrightarrow e \in s_2 \qquad (2.1)$$

Figure 2.1: Channelization problem: create as few groups as possible, every group being as uniform as possible; uniformity is defined by how uniform the interest of receivers within this group is. (a) Subscriptions $S_1$, $S_2$, $S_3$ and $S_4$ in notification space; they may intercept each other. (b) Groups $G_1$, $G_2$, $G_3$, $G_4$, $G_{13}$ and $G_{12}$ are formed from $S_1$, $S_2$, $S_3$ and $S_4$ such that all elements in a group belong to the same set of subscriptions. (Schemes from [34].)

$$G_p = \{nodes(s) \mid \forall s \in S \wedge s \cap p \neq \emptyset\} \qquad (2.2)$$

This method ensures there are neither false positives nor false negatives in event deliveries. Assuming the multicast network is optimal in terms of network overhead, this algorithm is both correct and optimal. This method also assumes every node knows every other node’s subscriptions; if $S$ is the set of all subscriptions in the system, then each node requires $O(|S|)$ space to store them. The algorithm assumes the multicast groups to be shared among all nodes in the system, thus each node must recompute the multicast groups as soon as it receives a new subscription. Also, this algorithm may require $2^N$ multicast groups for a system consisting of $N$ nodes. As this number grows very quickly with the number of nodes, it is necessary to merge enough groups to fit the underlying network requirements. Merging groups can induce false positives, since events are forwarded to larger groups containing nodes that are not interested in the same topics.
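The construction of the partition and of the multicast groups can be sketched as follows in Python. This is only an illustration under simplifying assumptions: the notification space is taken to be finite and enumerable, subscriptions are given as predicates, and the names `build_groups`, `subscriptions` and `nodes` are hypothetical. Notifications are grouped by the exact set of subscriptions they belong to, and each resulting element of the partition receives the union of the nodes interested in those subscriptions as its recipient group.

```python
from collections import defaultdict


def build_groups(points, subscriptions, nodes):
    """Partition notifications by the set of subscriptions they belong to.

    points:        iterable of notifications (finite space, an assumption)
    subscriptions: dict name -> predicate(point) telling if the point is in s
    nodes:         dict name -> set of node ids interested in subscription s
    Returns a dict: frozenset of subscription names -> (points of p, group G_p).
    """
    partition = defaultdict(list)
    for point in points:
        signature = frozenset(
            name for name, contains in subscriptions.items() if contains(point)
        )
        if signature:  # notifications matching no subscription need no group
            partition[signature].append(point)

    groups = {}
    for signature, cell in partition.items():
        recipients = set()
        for name in signature:
            recipients |= nodes[name]  # union of nodes(s) for every s meeting p
        groups[signature] = (cell, recipients)
    return groups


# Two overlapping subscriptions over a one-dimensional notification space.
subs = {"S1": lambda x: x < 6, "S2": lambda x: x >= 4}
interested = {"S1": {1}, "S2": {2}}
for signature, (cell, group) in build_groups(range(10), subs, interested).items():
    print(sorted(signature), cell, sorted(group))
```

With two overlapping subscriptions, three groups are already needed (one per subscription plus one for the overlap), which hints at how quickly the number of groups grows.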
The problem of reducing the number of groups while keeping the level of induced false positive deliveries as low as possible is known as the channelization problem, a computationally hard problem for which only approximate solutions are known [1].

An approximate solution to the channelization problem consists in dividing the notification space into cells. Each cell is mapped to a group of nodes, namely all nodes having at least one subscription intercepting the cell. Each cell is associated with exactly one multicast group, and each group is associated with at least one cell. A notification belonging to a cell is sent through the multicast group associated with this cell. Thus the number of required multicast groups is lower than or equal to the number of cells. However, as a consequence of decreasing the number of multicast groups, this method creates false positives. When cells are contiguous and disjoint, they can be mapped to a key computed from their characteristics, like their coordinates in the notification space. In the same way, subscriptions can also be mapped to one or several keys matching the cells they intercept. Notifications can then be routed to the keys denoting nodes interested in this event. This means that nodes are organized in a structured overlay, and notifications can easily be routed in that network. This method is similar to rendezvous-based routing (see below), since notifications and subscriptions meet in the network by following a path computed from their own characteristics. It is also possible to partition the subscription space instead of the notification space, when subscriptions can be grouped by semantic similarity.

Figure 2.2: The notification space is split into 4 cells, mapped to 3 different groups $G_{134}$ (dark red), $G_{12}$ (green) and $G_1$ (light blue), according to the interceptions between cells and subscriptions.

Shortest Path Spanning Trees

When multicast is not available, some variants of the ideal multicast algorithm (section 2.6.1) are available.
An alternative consists in building a Shortest Path Spanning Tree (SPT) rooted at the event publisher and spanning the nodes with matching subscriptions. It is obtained by pruning an SPT rooted at the publisher and spanning all nodes, down to the matching subscribers [10]. Such an SPT is built using techniques similar to link-state unicast and multicast routing [38, 39].

Selective routing

Selective routing algorithms aim to reduce the network overhead of flooding techniques, both by using routing tables to limit the spread of notifications and by reducing the number of subscriptions each node has to memorize and forward. This is particularly efficient when only a small portion of the nodes is actually interested in events, but it consumes slightly more memory and computing resources. If a major part of the nodes is interested in an event, event flooding may be a realistic option, since subscriptions do not have to be flooded through the network and nodes do not have to memorize any routing table.

Filtering-based

In filtering-based methods, notifications are routed only through paths leading to interested nodes. Such paths are built using reverse path learning [21], based on subscription forwarding. Subscriptions are spread through the network so that each node can build an event routing table; subscription forwarding is also restricted to the nodes whose routing table can be affected by the new subscription. Carzaniga et al. [14] describe this routing algorithm in detail. Filtering-based routing algorithms provide a correct and optimal solution for event routing. Also, nodes do not have to be aware of all subscriptions in the whole system, but only of their immediate neighbors’. Finally, they induce no limitation on the subscription language. On the other hand, as a consequence of needing other nodes’ interests, filtering-based routing strategies have to maintain a routing table and keep it consistent despite changes in other nodes’ interests and in the underlying network topology.
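The principle of reverse path learning can be sketched in Python as below. This is a deliberately simplified illustration, not the algorithm of Carzaniga et al. [14] nor the Atron implementation: it assumes a static, acyclic topology, represents subscriptions as predicates, forwards every subscription to all nodes without the covering-based restriction mentioned above, and uses hypothetical names (`Node`, `link`, `subscribe`, `publish`).

```python
class Node:
    """A node that is both a router and a potential publisher or subscriber."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []   # directly connected nodes (acyclic topology assumed)
        self.local = []       # filters of the subscribers attached to this node
        self.table = {}       # neighbor name -> filters learned from that direction
        self.delivered = []   # events delivered to local subscribers

    def subscribe(self, predicate):
        self.local.append(predicate)
        for neighbor in self.neighbors:
            neighbor._learn(predicate, self)

    def _learn(self, predicate, sender):
        # Reverse path learning: remember which neighbor the subscription came from.
        self.table.setdefault(sender.name, []).append(predicate)
        for neighbor in self.neighbors:
            if neighbor is not sender:
                neighbor._learn(predicate, self)

    def publish(self, event, sender=None):
        if any(p(event) for p in self.local):
            self.delivered.append(event)
        for neighbor in self.neighbors:
            if neighbor is sender:
                continue
            # Forward only toward directions where a matching filter was learned.
            if any(p(event) for p in self.table.get(neighbor.name, [])):
                neighbor.publish(event, self)


def link(a, b):
    """Connect two nodes with a bidirectional channel."""
    a.neighbors.append(b)
    b.neighbors.append(a)


a, b, c, d = Node("A"), Node("B"), Node("C"), Node("D")
link(a, b); link(b, c); link(b, d)
c.subscribe(lambda value: value > 10)
a.publish(15)   # follows the path A, B, C; D is never contacted
a.publish(5)    # matches no filter and does not leave A
print(c.delivered, d.delivered)
```

In this sketch each node only stores filters per neighbor, not the identity of the subscribers, which reflects the decoupling between publishers and subscribers.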
This algorithm is explained in more detail in section 3.6.2, about its implementation on the Atron modular robot.

Rendezvous-based routing

These algorithms make similar subscriptions and notifications converge to the same node. This node is then responsible for forwarding notifications to the subscribers it knows. More precisely, for a given subscription $\sigma$, a set of nodes $SN(\sigma)$, named the rendezvous nodes of $\sigma$, is responsible for storing $\sigma$ and forwarding events matching it to all its subscribers. For every event $e$ matching $\sigma$, there likewise exists a set $EN(e)$ of rendezvous nodes of $e$: $EN(e)$ represents all nodes responsible for forwarding $e$ to all its subscribers. $SN(\sigma)$ and $EN(e)$ can be obtained through a mapping function that maps a point in the notification space to an id. Such a function can be equation 2.4, where $x$ and $y$ are the values of attributes $a_1$ and $a_2$ in figure 2.3. $SN(\sigma)$ is the set of all ids calculated from all the points included in $\sigma$, and $EN(e)$ is the set of all ids obtained from the point $e$ through the mapping function. This method assumes that among the nodes responsible for a notification $e$, there is always at least one responsible for any subscription $\sigma$ that matches $e$ (equation 2.3). In other words, when a publisher notifies the set of nodes responsible for the event it fires, one of them must know which nodes are interested in this event. This condition is known as the mapping intersection rule [9]; meeting this requirement is a non-trivial task. Rendezvous-based routing was introduced in [53], and reused in systems such as Bayeux [56] and Hermes [44].

$$\forall e \in N \wedge \forall \sigma \subseteq N : e \in \sigma \Rightarrow SN(\sigma) \cap EN(e) \neq \emptyset \qquad (2.3)$$

$$map(x, y) = 2 \cdot (x - 1) + y \qquad (2.4)$$

Figure 2.3: Notification space composed of two attributes $a_1$ and $a_2$, both having numeric values.

Because different nodes route different notifications, rendezvous-based routing distributes the computing load well among all nodes in the system.
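As a small worked example of equations 2.3 and 2.4, the Python sketch below computes rendezvous node ids for a subscription and an event. It assumes, for illustration only, that both attributes take the integer values 1 and 2, so that the mapping yields the ids 1 to 4; the function names `map_id`, `subscription_nodes` and `event_nodes` are hypothetical.

```python
def map_id(x, y):
    """Equation 2.4: map a point (x, y) of the notification space to an id."""
    return 2 * (x - 1) + y


def subscription_nodes(points):
    """SN(sigma): ids computed from every point included in the subscription."""
    return {map_id(x, y) for x, y in points}


def event_nodes(event):
    """EN(e): ids obtained from the single point e."""
    x, y = event
    return {map_id(x, y)}


# Subscription covering a1 = 1 with any value of a2 (assumed 2 x 2 integer grid).
sigma = [(1, 1), (1, 2)]
event = (1, 2)

rendezvous = subscription_nodes(sigma) & event_nodes(event)
print(subscription_nodes(sigma), event_nodes(event), rendezvous)
```

Because the same function maps both the points of $\sigma$ and the point $e$, an event included in $\sigma$ always shares at least one id with it, which is what the mapping intersection rule requires.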
However, these algorithms do not cope well with dynamics: when a new subscription appears, the whole partition must be computed again and mapped again to nodes. Moreover, subscriptions handled by one node may be moved to another. Concerning fault tolerance, a mechanism must ensure that a node can replace a failing node by forwarding all the subscriptions it was responsible for. Finally, rendezvous-based methods suffer from another significant drawback: the notification space must be composed of attributes that are easy to map to a numeric value. While this is trivial with numbers or ordered discrete values, mapping a value from a string (for instance) can require many more calculations.

Probabilistic algorithms

Probabilistic algorithms differ from deterministic algorithms in that they do not rely on a deterministic routing table to forward subscriptions or notifications. This difference makes probabilistic algorithms less vulnerable to physical network topology changes than deterministic algorithms, which can suffer from an unsynchronized routing table. Probabilistic algorithms are inspired by epidemic spreading to disseminate information in a network; they are also called epidemic or gossip-based algorithms. They consist in forwarding pieces of information toward several randomly chosen neighbors, like a disease spreading in a given environment. Probabilistic algorithms proceed in synchronized gossip rounds, in which every peer exchanges information about the state of the global publish/subscribe system with a random set of its neighbors. In such a round, peers exchange their knowledge about published events as well as other peers’ subscriptions. Costa et al. [19] classify gossip communications along two independent dimensions: gossip can be push or pull style, and positive or negative. In push-style gossip, nodes periodically gossip their knowledge about notifications and publications.
On the other hand, in pull-style gossip, nodes solicit information from neighbors when they realize they need extra knowledge to compensate for losses. Gossip is said to be positive when messages contain the state of the publish/subscribe system as perceived by the message issuer, or negative when messages contain the knowledge required to compensate for a node’s lack of awareness of the overall publish/subscribe system’s state. A gossip-based communication mechanism is often push/positive to spread the system state, or pull/negative to react when a node has detected a message loss; however, the push/pull and positive/negative dimensions have, in principle, no influence on each other. The pull/negative pair is often preferred for its high degree of reactivity as well as its low communication rate. However, with the pull/negative pair, the time before a node realizes it is missing some information can be long; in that case, push/positive-based algorithms can be a better solution. Gossip-based algorithms present interesting features such as an even load distribution and very good resilience to topology changes. Also, these algorithms are said to be simple to implement and inexpensive to run. Finally, probabilistic algorithms scale well, as they keep their properties when the global network grows to more nodes.

Event routing for MANETs and wireless sensor networks

Publish/subscribe implementations for MANETs can rely on both deterministic and probabilistic methods. Such systems, usually embedded, are highly constrained by energy-saving concerns. This constraint often leads their designers to use broadcast communications, which are cheaper than unicast in a wireless network context, and to design the publish/subscribe algorithms over a data link layer (802.11X) or a classical MANET routing protocol such as MAODV ([55]). Anceaume et al. [3], Huang and Garcia-Molina [31] and Picco et al.
[43] describe different algorithms for building and maintaining tree-shaped event routing structures in MANETs over a data link layer. Picco et al. [43] present an algorithm for restoring the event routing table after a disconnection in a loop-free topology within a mobile ad-hoc network. It features a separation of concerns between the tree connectivity layer and the event forwarding layer. These deterministic techniques show their limitations when coping with dynamic networks: when a change happens in the physical topology, it takes time to update the routing tables. In the period between the topology change and the update of the routing tables, events may be forwarded in the wrong direction, creating false positives and false negatives. Baldoni et al. [7] and Baehni et al. [5] present structureless algorithms, which do not maintain any deterministic data structure about the topology. They are based on flooding or gossip. More particularly, Baldoni et al. [7] introduce an informed flooding technique based on Euclidean distance to direct an event to its destinations.

Event routing for sensor networks

Hall et al. [30] and Costa et al. [18] propose adaptations of event routing for wired networks to sensor networks. Hall et al. [30] modify the work from [42], which relies only on broadcast communications. It provides specific solutions to reduce packet collisions. Also, Hall et al. [30] adapt [14] for sensor networks by extending the acyclic overlay network with backup routes to handle permanent and transient failures.

2.7 Conclusion

The publish/subscribe approach to communications involves many different features. The first is the physical network on top of which the publish/subscribe mechanism must be implemented: depending on its structure, its stability or the communication methods it offers (unicast, broadcast or multicast), the solution can be very different.
Then the publish/subscribe algorithm can involve different nodes and connectivity, and thus design its own network. This network can be broker-based, or a structured or unstructured peer-to-peer overlay. This publish/subscribe-level network depends mostly on the characteristics of the underlying network.

Another important feature to consider in a publish/subscribe system is its subscription model: among a large variety of subscription models, topic-based and content-based models represent a large part of the investigations. The former is based on a discrete and bounded set of topics that subscribers can subscribe to, and that publishers publish about. This is a common scheme for publish/subscribe systems, such as RSS feeds on the Internet. It is very easy both to implement and to run, but remains very limited in its expressiveness. The latter, the content-based model, is a generalization of the topic-based model: it offers a much wider expressiveness (based on a continuous, unbounded "topic" set) but is much more difficult both to implement and to run. It is difficult to implement because of the need to design a subscription language comparable to SQL, and difficult to run because of the need to match events against subscriptions, and because of the channelization problem of forwarding events through suitable multicast trees (see [34]).

Among these different characteristics, there are three main categories of publish/subscribe algorithms. The first category relies on flooding events or subscriptions; the second is based on building routing tables from subscriptions, so as to forward notifications efficiently toward the matching subscribers; and the last category is based on epidemic or gossip algorithms that spread events in a random-based way. These three main strategies have different advantages: flooding-based methods are very simple, routing table-based algorithms are efficient, and gossip-based algorithms are reliable in dynamic networks.
On the other hand, flooding techniques are not scalable, routing table-based strategies hardly support dynamic networks, and gossip-based algorithms cannot guarantee that all events are always delivered to all their subscribers.

Networks on dynamically reconfigurable robots are typically peer-to-peer, in the sense that messages are forwarded from module to module instead of circulating on a dedicated and shared bus. An exception is the Odin modular robot [28], which is able to provide such a structure. These networks also provide no unicast, multicast or broadcast communication primitive, but only a way to send a message to immediate neighbors through all the available channels. In addition, their communication devices may not be reliable, and transmitted messages may be corrupted. Finally, modular robots can be pack, herd or swarm robots (see section 1.1.1). They thus need publish/subscribe algorithms able to cope with an unstructured peer-to-peer underlying network and to handle its errors. Such algorithms must be suitable for pack robots (where flooding-based methods may show better results), swarm robots (where gossip-based methods seem more suitable) or herd robots, where a routing table-based algorithm may be a suitable intermediate solution between pack and swarm.

Chapter 3
Implementing publish/subscribe communications on Atron

3.1 Introduction

This chapter introduces the implementation of two publish/subscribe algorithms for the modular robot Atron. The algorithms are integrated into ASE (see section 1.2). The publish/subscribe algorithms described here are based on different routing techniques, but they all have to run on a similar underlying network topology. In addition, as a result of Atron's self-reconfiguration capability, they also have to cope with highly dynamic network connectivity. The following sections discuss the choices made in the publish/subscribe implementations.
They explain the selection of publish/subscribe algorithms, the subscription model, as well as details such as how to handle network cycles. Everything explained here sticks to general working conditions. However, some Atron-specific features may induce more particular properties that make simplifications and/or improvements possible; they are also described in this chapter.

3.2 Underlying network and suitable pub/sub algorithms

All implemented algorithms run on the modular robot Atron. Atron presents some particularities due either to the modular robot category it belongs to, or to its own particular characteristics. Here is a list of the Atron attributes likely to affect the nature of the underlying network.

• Self-reconfigurable: the underlying network topology might change several times while the robot is functioning.
• Pack or herd robot: the number of modules is typically limited to a few dozen. Atron robots may be composed of a hundred modules, but are unlikely to reach several thousand modules.
• Cheap, infrared-based communication devices: cheap communication devices may be slow and exposed to errors or packet losses due to light interference.
• Peer-to-peer only communications: every channel available on a module may reach at most one other module, or no module at all. Nothing, in hardware or software, allows two modules to be linked directly if they are not immediate neighbors. There is no multicast method available.

The self-reconfigurable nature of Atron represents its most restrictive feature. This constraint places the network of an Atron-based robot in the category of Mobile Ad-hoc Networks (MANETs). On the other hand, the unstructured, peer-to-peer nature of communications in Atron robots represents a strong orientation toward flooding or gossip-based algorithms.
Finally, since Atron is equipped with error-prone communication devices and is able to form herd-type robots, a random-based or gossip algorithm might be the best alternative to allow modules to communicate. However, Atron robots often form pack robots consisting of only a couple of dozen modules. Pack robots need a tighter synchronization and can hardly rely on module redundancy to handle errors due to failures or to the randomness of algorithms. Therefore, a deterministic algorithm may also be necessary.

As a consequence, two algorithms are implemented for the Atron robot. The first one, event flooding, remains the simplest and mainly serves for comparison purposes (see section 2.6.1). The second is a deterministic algorithm needed to handle pack robots composed of a limited number of modules; this implementation is based on filtering-based strategies (section 2.6.1). A third possible algorithm is intended for herd robots, which are able to handle failures thanks to numerous modules allowing redundancy. It is a gossip-based publish/subscribe algorithm as introduced in section 2.6.1; this last algorithm is not implemented.

All algorithms implemented in this work share a common programming interface. A common interface for several different publish/subscribe algorithms makes sense, as these different algorithms provide the same service: subscribing to events and publishing them. Keeping a common interface provides users with a similar front-end, and lets them use these services in a similar way whatever the underlying algorithm is. This property also eases the replacement of a given publish/subscribe algorithm in any robot's controller. Finally, the stable, universal publish/subscribe interface structures the design of new algorithms, making such development easier. Figure 3.6 illustrates how the implementation is structured.

3.3 Software architecture

The deployment diagram shown in figure 3.1 represents the organization of all software elements available in a module.
This module is part of a modular robot using a publish/subscribe system, and all modules in the robot are identical. A module runs a controller encapsulating the ASE framework. ASE provides the publish/subscribe system, composed of a subscription model, a publish/subscribe algorithm and distributed barriers.

Figure 3.1: Deployment diagram of the publish/subscribe system on an Atron robot.

3.3.1 Algorithm synchronization with a distributed barrier

Synchronization makes sure all modules in the modular robot wait for each other before they start to run the algorithm. This guarantees that, when the robot starts, no module is still initializing and unable to take part in the routing algorithm. Synchronization serves the same purpose for algorithm reinitialization. It uses a distributed barrier, which is explained below.

When starting, a node initializes its publish/subscribe algorithm, then runs the distributed barrier once. When crossing the first barrier, it starts to listen to incoming messages related to the pub/sub algorithm and runs a distributed barrier a second time. When this second barrier is crossed, it starts to output publish/subscribe algorithm-related messages. This double barrier ensures that no module outputs a message before all other modules are actually ready to receive it. The double barrier may be avoided when starting the robot, but it turns out to be important for handling the reinitialization that any node can request at any moment.

Figure 3.2 illustrates how a distributed barrier works. Every node keeps in memory the state it knows about all nodes of the robot. The algorithm consists, for every individual node n, in sending to all its neighbors its own knowledge about the state of all modules in the whole robot. At the beginning, this knowledge simply states that only node n itself is known to be ready. When node n receives a message, the knowledge it carries is merged with the previous knowledge n had.
If this incoming message brings any valuable new information, then n broadcasts its new knowledge again. When a node n knows that all modules have started, it crosses the barrier and the algorithm is finished. This algorithm works assuming every node knows in advance how many nodes the complete robot is composed of. Notice that when the algorithm is stopped, it is still possible to receive different incomplete knowledge messages from neighbors.

Figure 3.2: An example of a distributed barrier.

3.3.2 Publish/subscribe algorithms

Flooding algorithm

Interactions between nodes with the flooding publish/subscribe algorithm are very simple: figure 3.3 shows a simple event flooding scenario where the underlying network is composed of a cycle, and only one node has subscribed to an event. This extreme simplicity induces several problems. Figure 3.3 shows that events are spread through all channels in the network. Most of these transmissions are useless for routing the event to its subscribers only. Network cycles are also a problem that requires duplicate detection, as discussed in section 3.6.1.

Figure 3.3: 0 subscribes to A, then 5 fires e with e ∈ A. The event flooding method induces several duplicates and false positives.

Filtering algorithm

Consider the network in figure 3.3. The event filtering algorithm works as in the scenario shown in figure 3.5:

Initialization (phase 1) In the initialization phase, the algorithm builds a spanning tree; this phase is explained below and illustrated in the first phase of figure 3.5. It aims to change the network at the top of figure 3.4 into the one in the middle. Here the spanning tree breaks the loop by ignoring the link between nodes 5 and 6 in the network shown in figure 3.3.

Subscriptions (phases 2 and 3) When a spanning tree is available, the efficient broadcast method is used to propagate node 4's subscription to all other nodes (message sub a). Node 1's subscription is propagated through the same method, but this subscription is filtered by node 4, according to the method explained in section 3.6.2. Propagating these subscriptions allows the spanning tree to be pruned dynamically, for each event fired in the system, to obtain the bottom topology in figure 3.4.

Figure 3.4: The event-filtering algorithm efficiently prunes the network connectivity so that the broadcast events are forwarded only toward interested nodes.

Event propagation (phase 4) The fourth phase in figure 3.5 shows the propagation of an event matching subscription b through the network, using the pruned spanning tree (message event e).

Unsubscriptions (phases 5 and 6; optional) The fifth phase illustrates the propagation of the unsubscription from b issued by node 1 (message unsub b). Again in accordance with the algorithm described in section 3.6.2, this unsubscription is filtered. The last phase illustrates node 4's unsubscription from a (message unsub a). This unsubscription is not filtered and is thus broadcast through the entire system.

Building a spanning tree

There are several ways to calculate a spanning tree. Gallager et al. [26] propose a distributed algorithm to calculate a minimum-weight spanning tree. Malek et al. [36] also propose an algorithm to calculate spanning trees while reducing the message overhead. They argue that the method proposed by Gallager et al. [26] induces too much communication and is not suitable for energy-constrained systems such as sensor networks. The method described in [36] features the calculation of an "almost" minimal spanning tree, an algorithm friendly to energy constraints, and the possibility to repair the spanning tree when nodes fail or join.

The method implemented in the filtering-based algorithm to calculate a spanning tree remains very simple. It is represented in the first part of figure 3.5. The process relies on nodes pushing spanning tree requests (spt_req in the figure) as well as marking nodes and their channels.
A channel can be marked as being part of the spanning tree or not, and as having been evaluated or not with respect to its membership in the spanning tree. The algorithm works as follows:

• A unique node initiates the spanning tree calculation. It marks itself as already integrated into the spanning tree and broadcasts through all its available channels a spanning tree calculation request (spt_req in figure 3.5), where the root is the node itself.
• All nodes receiving a spanning tree calculation request check their own state: if they are marked as already part of the spanning tree, they immediately reply with a negative answer (message refuse). Otherwise, they mark the channel from which the spanning tree calculation request was received as evaluated and part of the spanning tree, mark themselves as integrated in the spanning tree, and send a spanning tree calculation request through all channels not already marked. If no unmarked channel is available, the node sends back a positive answer (message accept).
• When it receives an answer, a node marks the channel through which the answer was received as evaluated. If the answer was positive, it also marks this channel as part of the spanning tree. Otherwise it does nothing more. When all channels are evaluated, a positive answer (message accept) is sent through the channel from which the spanning tree calculation request was received, and the algorithm stops for this node.

When the algorithm ends, all nodes run a distributed barrier to make sure they don't start to route subscriptions and notifications before all other nodes are also ready to forward them in the right direction.

3.4 Implementation details

Figure 3.6 represents the static relationships between all elements available in the publish/subscribe system. Their behavior is described in the previous section, and this section describes their implementation. Notice the existence of two classes implementing the event_core interface.
They provide implementations for the event flooding and filtering algorithms. Similarly, the single_attribute class implements a subscription model.

3.4.1 PublishSubscribe class: the publish/subscribe interface to the user

This section introduces the front-end interface of the publish/subscribe algorithm, as it is to be used in a modular robot's controller. It is intended for programmers who want to take advantage of the publish/subscribe system in their work. It provides all the methods required to interact with a publish/subscribe system, such as firing events or expressing a subscription through the network. This implementation relies on the Event_core interface (described in section 3.6). It is interesting to notice that the data structures event, subscription, unsubscription and internal are defined nowhere in this class. These data structures and the methods manipulating them (initialization, serialization or deserialization) are highly dependent on the underlying subscription model; thus they are defined separately in each implementation.

The role of this class also includes distributed synchronization: it makes all nodes in the network start the algorithm at the same time, using the distributed barrier described in section 3.3.1. This class also handles the reception of incoming messages and transforms them into data structures thanks to methods provided by the Event_core interface, before transferring them to the actual publish/subscribe algorithm through the same interface.

Figure 3.5: Interactions between nodes when subscribing, unsubscribing and firing events through the network (network as shown in figure 3.3).

Figure 3.6: Class diagram of the implementation (classes Controller, PublishSubscribe and Barrier; the Event_core interface with its Event_filtering and Event_flooding implementations; the SubscriptionModel interface with its Single_attribute implementation). Note that exactly one publish/subscribe algorithm and one subscription model can be chosen at compile-time.

Available methods

    void event_initialize();

Gets the publish/subscribe algorithm to be initialized. Should be used only at module initialization time.

    int event_publish(event);

Publishes the given event data structure through the network.

    int event_subscribe(sub, event_handler);

Spreads a subscription through the network. Also associates an arbitrary function with the reception of events matching this subscription.

    int event_unsubscribe(sub);

Spreads an unsubscription through the network, and dissociates the function from events matching this subscription.
    void event_display(event);

Prints a string representation of the event on the terminal.

    int event_wait();

Stops the current thread until the publish/subscribe algorithm is fully initialized. This function behaves like, and returns the same value as, sem_wait(sem_t*). Take care to call it, or event_trywait(), only once, as a second call without a previous reinitialization results in an endless thread pause.

    int event_trywait();

Tests whether the algorithm is fully initialized. If so, returns 0; otherwise returns another value. The return value is identical to that of sem_trywait(sem_t*). This function returns 0 only once after the algorithm has been initialized. Any other attempt to call this method or event_wait() without a previous reinitialization results in a failure (returned value different from 0).

    int event_reset();

Initiates a global reinitialization of the publish/subscribe algorithm, in all modules composing the modular robot.

3.4.2 Subscription model interface

This interface lists all the services an implementation of a subscription model should provide. This includes set operations such as intersection or inclusion, as well as publish/subscribe-specific operations such as subscription_match(sub, notif).

    char subscription_null(sub s);

This function must return true if the subscription is null. A subscription is null when the conditions described in equations 3.1 and 3.2 are satisfied. These conditions are equivalent.

\[ \forall n \in N : \neg\,\mathrm{match}(s, n) \tag{3.1} \]
\[ s \cap N = \emptyset \tag{3.2} \]

    sub subscription_initialize();

The implementation of this function returns a new subscription. This new subscription is a null subscription according to the conditions described in equations 3.1 and 3.2. A subscription returned by this function is always evaluated as null.
    sub subscription_union(sub s1, sub s2);

This function returns a subscription matching all interests of s1 as well as all interests of s2, as described in equation 3.3.

\[ \mathrm{union}(s_1, s_2) = s_1 \cup s_2 \tag{3.3} \]

    sub subscription_inter(sub s1, sub s2);

The intersection operation between two subscriptions s1 and s2 gives a subscription matching all interests s1 and s2 have in common. This operation is described in equation 3.4. If the output subscription is not null, then s1 and s2 are said to intersect each other.

\[ \mathrm{inter}(s_1, s_2) = s_1 \cap s_2 \tag{3.4} \]

    sub subscription_minus(sub s1, sub s2);

The minus operation between s1 and s2 results in a subscription with all interests of s1 except the ones also in s2. This operation is shown more formally in equation 3.5.

\[ \mathrm{minus}(s_1, s_2) = \{ x \mid x \in s_1 \land \neg(x \in s_2) \} \tag{3.5} \]

    char subscription_covers(sub s1, sub s2);

This function implements the covering property of a subscription over another one, as shown in equation 3.6. A subscription s1 is said to cover a subscription s2 if all notifications in s2 are also in s1.

\[ \forall n \in N : n \in s_2 \Rightarrow n \in s_1 \tag{3.6} \]

    char subscription_match(sub, notif);

This function tells whether a notification matches a subscription. This is a central function in content-based publish/subscribe systems, as it allows the algorithms to decide whether or not they have to accept a notification, and through which channels they have to forward it. A notification matches a subscription if the point it defines in the notification space belongs to the subscription's subset; equation 3.7 gives a more formal definition of the property this function tests.
\[ \mathrm{match}(s, n) = n \in s \tag{3.7} \]

    char subscription_sub_serialize(sub, char *);
    char subscription_notif_serialize(notif, char *);

These functions turn a subscription or a notification data structure into a sequence of bytes describing it. Such a sequence is then sent through the robot's communication channels.

Deserialization

    char subscription_sub_deserialize(sub *, char *);
    char subscription_notif_deserialize(notif *, char *);

Turns a sequence of bytes into a subscription or a notification. This is the exact reverse of serialization.

    void subscription_sub_display(sub);
    void subscription_notif_display(notif);

Implements a representation of subscriptions or notifications to be displayed on the terminal, if any.

3.4.3 Single attribute implementation

Publish/subscribe communications offer various subscription models, among which topic-based and content-based models are popular. The topic-based model presents a very simple mechanism, easy to implement and run. On the other hand, content-based systems are much more powerful, as they allow very fine-grained interests in particular events to be expressed.

The subscription model implemented in this class represents a trade-off between topic-based and content-based systems. It aims to benefit from the simplicity of topic-based systems while keeping the more general scheme of content-based ones. In other words, it eases the implementation but leaves the possibility to switch to a more complete model in the future. It consists of a content-based system having only one attribute. In that sense, it is exactly equivalent to topic-based systems. However, it behaves as a content-based system, since notifications and subscriptions are not represented by a topic but by a notification subspace.
This subscription model is called the "single attribute subscription model". This section defines the general elements provided by the single attribute subscription model. A subscription (sub class), which describes a notification subset, must not be confused with a subscription message (subscription class), the content of a message sent between nodes to express a subscription. The latter differs between publish/subscribe implementations, which provide their own specific additional features.

Elements in single attribute subscription model

The following paragraphs give a common basis for the subscription model used in all algorithms implemented in this work. They explain how the basic event-based features described in sections 2.4 and 2.5 are implemented. However, this is only a global description, and different algorithms may use the additional data structures they require to work properly. These algorithm-specific features are described in the sections dedicated to these algorithms.

Notification space Content-based systems are all based on a notification space. In topic-based systems, this notification space is the set of all topics available in the system. The same holds for the notification space in the single attribute subscription model. By analogy with topic-based systems, every element of the notification space can be regarded as a topic. In the implementation, the notification space is represented by a 32-bit integer, where each bit is one particular topic.

    sub initialize() {
        return 0;
    }

Figure 3.7: C code for subscription initialization in single attribute subscription model

    sub null(sub s) {
        return s == 0;
    }

Figure 3.8: C code to check if a subscription is null in single attribute subscription model

Subscriptions A subscription expresses a node's particular interest in a certain subspace of the notification space. It can be a single point, a subspace or even the whole notification space.
In the single attribute implementation, a subscription is also represented by a 32-bit integer, where the subspace of interest is expressed through all bits taking the value 1.

Notifications In content-based systems, a notification represents a single point in the notification space. The same goes for the single attribute implementation: a notification represents a single bit in the 32-bit integer of the notification space. It is expressed by a number from 0 to 31 denoting exactly one bit of the notification space. As in the topic-based subscription model, a notification in the single attribute subscription model does not carry much valuable information by itself. Instead, the actual payload of a notification must be attached separately. This feature depends on the publish/subscribe algorithm implemented, and is introduced in the descriptions of the different implemented algorithms.

Implementation of single-attribute model's operations

This section introduces how the single-attribute model implements the different operations required by the subscription model interface. This includes usual data structure management, but also content-based specific manipulations of the subscription space and its subspaces.

Subscription initialization A newly initialized subscription is a subscription that doesn't match any event. In the single-attribute subscription model, it is a subscription whose 32 bits are all 0, so that the match operation always returns false.

Null subscription A null subscription is a subscription that doesn't match any event. A newly initialized subscription is null. In the single-attribute subscription model, a null subscription is a subscription whose 32 bits all equal 0.

Subscription union The union of two subscriptions results in another subscription that holds all interests from both operand subscriptions. In the single-attribute subscription model, such a union is obtained with a bitwise OR operation.
The code given in figure 3.9 gives an example of implementation.

    sub union(sub s1, sub s2) {
        return s1 | s2;
    }

Figure 3.9: C code for subscription union in single attribute subscription model

Subscription intersection The intersection of two subscriptions forms a subscription which expresses the interests common to both operand subscriptions. In the single attribute subscription model, this is a bitwise AND operation, as in the C code shown in figure 3.10.

    sub inter(sub s1, sub s2) {
        return s1 & s2;
    }

Figure 3.10: C code for subscriptions intersection in single attribute subscription model

Subscription difference The difference between two subscriptions results in a subscription having all interests of the first operand subscription, except the ones also in the second subscription. This operation is achieved with the logical expression shown in figure 3.11.

    sub minus(sub s1, sub s2) {
        return s1 & ~s2;
    }

Figure 3.11: C code for subscription difference in single attribute subscription model

Subscription coverage A subscription s1 covers a subscription s2 if all interests expressed in s2 are also expressed in s1. In the single attribute model, this means that all bits at 1 in s2 must also be at 1 in s1. In figure 3.12, the comparison to −1 makes sure all bits are indeed 1 (a numeric C variable whose bits are all 1 equals −1, due to the low-level representation of negative numbers).

    char covers(sub s1, sub s2) {
        return (s1 | ~s2) == -1;
    }

Figure 3.12: C code for subscriptions' covering property in single attribute subscription model

Matching notifications A notification matches a subscription if the point it expresses is included in the subscription's subspace. In the single attribute implementation, this means that the bit of the 32-bit word denoted by the notification is 1 in the subscription.

    char match(sub s, notif n) {
        return ((s >> n) & 1) != 0;
    }

Figure 3.13: C code for matching notifications against subscriptions in single attribute subscription model

3.5 Barrier: distributed synchronization among nodes

Nodes receive, manipulate and send messages called "barriers" to exchange knowledge about which other nodes in the network are ready to work. A node keeps such a message and manipulates it through different methods and incoming messages. These methods allow the node to mark itself as ready, to assimilate the knowledge carried by an incoming message, and to check from its own knowledge whether all other nodes are ready.

    void barrier_set(barrier, unsigned char);

Marks the node whose id is given as ready in the given barrier.

    int barrier_ready(barrier);

Checks whether or not all nodes are ready, according to the knowledge carried by the given barrier message.

    int barrier_merge(barrier, barrier);

Merges the second barrier into the first one. Typically, the first barrier parameter is the node's internal knowledge and the second is a message received from a neighbor. Eventually, after several merges, the first barrier is tested as ready by the previous method, barrier_ready(barrier).

3.6 Event_core class: interface for algorithms' implementations

The event_core class lists all the methods a publish/subscribe implementation needs to provide so that it can be used by the PublishSubscribe class (see figure 3.6). These methods include publish/subscribe-related methods, such as the publish and subscribe methods, as well as methods handling the reception of events, subscriptions, unsubscriptions and internal messages. This interface also defines methods for initialization, reinitialization and the local barrier used to stop the controller until the algorithm is ready to work.

    int event_core_initialize();

Initializes the algorithm.
This is run only once, during the module’s initialization. This function is never called again when the algorithm is reinitialized.

    int event_core_reinitialize();
Reinitializes the algorithm. This function is run immediately after event_core_initialize() but can also be called again when a module receives a reinitialization signal. This function is called before any message related to the algorithm’s reinitialization can be received. It may be used to set the module in such a state that it can answer consistently to other modules communicating with it for their own reinitialization.

    int event_core_reset();
Reinitializes the algorithm at the initiative of this module. This method is systematically called after event_core_reinitialize(), but not necessarily right after. This may be the right place to initiate a conversation with neighbor modules, if reinitialization needs one.

    int event_core_publish(event);
Spreads an event through the network. The publish/subscribe implementation may fill or alter the event data structure.

    int event_core_subscribe(sub, event_handler);
Spreads a subscription through the network, and associates the subscription with a function which will be run each time a matching event is received.

    int event_core_unsubscribe(sub);
Spreads an unsubscription through the network, and dissociates the function previously associated with it.

    int event_core_receive(event, char);
This function is triggered when an event is received. This is the right place to match the received event against a local or a distant subscription, run the appropriate associated functions and/or forward it through the network.

    int event_core_sub_receive(subscription, char);
Run when a subscription is received. This function may assimilate the subscription into a routing table, then forward it according to the algorithm.
    int event_core_unsub_receive(unsubscription, char);
Handles the reception of an unsubscription. Its role may be similar to event_core_sub_receive(), but it handles unsubscriptions instead of subscriptions.

    event event_core_deserialize(char *, int);
Deserializes an event from its byte array form to an event data structure. This interface does not require the complementary function, though it is more than likely to be necessary.

    subscription event_core_sub_deserialize(char *);
Deserializes a subscription from its byte array form to a subscription data structure. This interface does not require the complementary function, though it is more than likely to be necessary.

    unsubscription event_core_unsub_deserialize(char *);
Deserializes an unsubscription from its byte array form to an unsubscription data structure. This interface does not require the complementary function, though it is more than likely to be necessary.

    internal event_core_internal_deserialize(char *);
Deserializes an internal communication packet from its byte array form to an internal communication packet data structure. An internal packet is used in communications which do not directly publish, subscribe or unsubscribe, but which may be useful to initialize the algorithm. The event filtering algorithm uses these internal packets to build a spanning tree. This interface does not require the complementary function, though it is more than likely to be necessary.

    int event_core_internal_receive(internal, char);
Run when an internal message is received.

    int event_core_internal_send(internal, char);
Sends an internal message through a channel.

    int event_core_wait();
Stops the calling thread until the algorithm has finished being initialized or reinitialized. Implementing this function with a semaphore is a good idea.
This function should return the same value as sem_wait().

    int event_core_trywait();
Tests if the algorithm is initialized and ready to work. Returns a positive answer only once after it has been started or restarted. The returned value must be similar to that of sem_trywait(), and this function could also be implemented with a semaphore.

    void event_core_display(event);
Displays an event in a text-based representation on the terminal.

3.6.1 Event flooding algorithm
This section explains the implementation of the flooding algorithm, represented in figure 3.6 as the class “Event_flooding”. Event flooding has been briefly introduced in section 2.6.1. In this simple algorithm, publishers fire every event through all available channels. Subscribers filter incoming notifications according to their own interests. All nodes in between forward every notification they receive through all available channels but the one the notification was received from. This strategy obviously generates a large amount of network traffic, and a significant part of this traffic is useless. However, event flooding has several advantages: its simplicity makes it robust and less error-prone, and therefore reduces the chances of software failure. Also, since it does not need to store other nodes’ interests, the amount of memory it needs to work is rather low. Because this algorithm is very simple, it has difficulties handling some characteristics of the underlying network. The following section introduces how this implementation handles network cycles when managing communications.

Cycles management
Cycles in network topologies introduce more complexity in message broadcasting. The phenomenon illustrated in figure 3.14 causes broadcast messages to be repeated indefinitely through the network.

Figure 3.14: A packet broadcast in a cyclic network unable to detect duplicates may be forwarded indefinitely: step 6 is similar to step 3. After step 6, the network takes the state shown in step 4.
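To make the phenomenon concrete, the following minimal sketch simulates flooding without duplicate detection on a triangle of three fully connected nodes. The topology, the step-based timing, the buffer size and the name flood_in_flight are assumptions made for this illustration; they are not taken from the Atron implementation.

```c
#include <stddef.h>

#define NODES 3      /* hypothetical triangle: each node is linked to the two others */
#define MAX_MSG 64   /* capacity of the in-flight buffers in this sketch */

typedef struct { int from; int to; } hop;   /* one copy travelling over one link */

/* Returns how many copies of a single event fired by node 0 are still
 * travelling after the given number of forwarding steps, when every node
 * forwards through all channels except the incoming one and no node
 * detects duplicates. */
int flood_in_flight(int steps)
{
    hop cur[MAX_MSG], next[MAX_MSG];
    size_t ncur = 0, nnext;

    /* Node 0 fires the event through all of its channels. */
    for (int n = 1; n < NODES; n++) {
        cur[ncur].from = 0;
        cur[ncur].to = n;
        ncur++;
    }

    for (int s = 0; s < steps; s++) {
        nnext = 0;
        for (size_t i = 0; i < ncur; i++) {
            /* The receiver forwards to every neighbor but the sender. */
            for (int n = 0; n < NODES; n++) {
                if (n != cur[i].to && n != cur[i].from && nnext < MAX_MSG) {
                    next[nnext].from = cur[i].to;
                    next[nnext].to = n;
                    nnext++;
                }
            }
        }
        for (size_t i = 0; i < nnext; i++)
            cur[i] = next[i];
        ncur = nnext;
    }
    return (int)ncur;
}
```

In this sketch the count never reaches zero: two copies keep chasing each other around the triangle whatever the number of steps, which is the endless circulation the figure illustrates.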
In structured networks, this problem is fixed by detecting duplicated packets. Every packet received from a channel other than the one on the shortest path leading to its issuer is considered a duplicate ([50, page 371]). However, the underlying network in Atron does not provide this information. Another way to detect duplicate packets consists in making a node able to recognize a packet it has already received. However, storing and comparing every received message may induce several issues:
• The required memory is significant and grows as new messages are received. It is impossible to store and recognize new packets once the memory is saturated.
• Exhaustively comparing every received packet with all stored packets may be a long and difficult task. It may cause the overall performance to drop drastically.

Figure 3.15: An example of the packet memory list.

As a consequence, storing and comparing every received packet is not a realistic solution. This is especially true in modular robots: embedded software and tight hardware constraints result in a small amount of available memory and poor processing capabilities.

A practical solution to duplicate detection
A solution to duplicate detection is based on the idea described in section 3.6.1: it lies in recognizing, or not, every new incoming packet. However, it differs from the former solution on the following points:
• The memory allocated to store events is limited to a fixed-size list. When this list is saturated, an incoming packet is stored at the beginning of the list.
• A node does not actually store the whole received packets, but only their issuer and a serial number. Therefore each stored packet uses only a fixed, limited amount of memory.
This is an approximate solution: a list stores the issuer and serial number of all received packets.
If a new packet is received when the list is already full, this new packet’s issuer and serial number replace the first value in the list, and the following packets replace the following cells of the list in turn. When a packet is checked for having already been received or not, the node browses the list from its beginning to the last cell holding a value (search length in figure 3.15). If it finds an issuer/serial pair identical to the values of the received packet, then this packet is considered a duplicate.

Limitations
The use of an approximate solution introduces the possibility of errors in duplicate detection: it may fail to recognize as a duplicate a packet which has actually already been received. This solution to duplicate detection relies on two assumptions to guarantee that every duplicate packet can be detected. First, it assumes every packet is sent with its issuer’s id as well as a unique serial number. When emitting a packet, it is the issuer’s duty to make sure it attaches its own id and a different serial number to each different emitted packet. Second, the list must be long enough to handle all the packets circulating in the network at the same time. If the list is too short, a new incoming packet might overwrite an older received one, making the node forget about it. If this old packet is still circulating in the network and reaches this node again, then it is not detected as a duplicate. The first assumption is relatively easy to implement: each packet’s header includes its issuer’s id and a serial number. This serial number can be an integer incremented each time the node sends a new packet.

Notification: Notification point | Issuer | Serial | Content | Content length

    typedef struct {
        notif type;
        event_content content;
        unsigned char issuer;
        unsigned char serial;
        int length;
    } event;

Figure 3.16: A C structure describing an event message in event flooding publish/subscribe algorithm.
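The packet memory list described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code running on the Atron modules: the names packet_memory, pm_init, pm_seen, pm_remember and pm_accept, and the list length of 4, are hypothetical and chosen only for the example.

```c
#include <string.h>

#define PM_LENGTH 4   /* hypothetical list length, to be tuned experimentally */

typedef struct {
    unsigned char issuer[PM_LENGTH];
    unsigned char serial[PM_LENGTH];
    int used;   /* number of cells holding a value (the search length) */
    int next;   /* cell that the next stored packet will occupy */
} packet_memory;

void pm_init(packet_memory *pm)
{
    memset(pm, 0, sizeof *pm);
}

/* Returns 1 if the issuer/serial pair is in the list, 0 otherwise. */
int pm_seen(const packet_memory *pm, unsigned char issuer, unsigned char serial)
{
    for (int i = 0; i < pm->used; i++)
        if (pm->issuer[i] == issuer && pm->serial[i] == serial)
            return 1;
    return 0;
}

/* Stores a pair; once the list is full, cells are overwritten starting
 * again from the beginning of the list. */
void pm_remember(packet_memory *pm, unsigned char issuer, unsigned char serial)
{
    pm->issuer[pm->next] = issuer;
    pm->serial[pm->next] = serial;
    pm->next = (pm->next + 1) % PM_LENGTH;
    if (pm->used < PM_LENGTH)
        pm->used++;
}

/* Returns 1 when the packet is new and must be processed and forwarded,
 * 0 when it is recognized as a duplicate and must be dropped. */
int pm_accept(packet_memory *pm, unsigned char issuer, unsigned char serial)
{
    if (pm_seen(pm, issuer, serial))
        return 0;
    pm_remember(pm, issuer, serial);
    return 1;
}
```

With a list of 4 cells, a fifth distinct packet overwrites the first cell, so the first packet is no longer recognized if it comes back. This is the missed duplicate that a too short list produces.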
However, the second constraint is much more difficult to satisfy: it requires knowing the maximum number of packets which can be circulating in the network and about to reach the same node at the same time. As this number remains very hard to estimate, a good way to get a minimal working value could consist in running the network under the worst possible conditions, and increasing the list’s length until the network behaves as expected.

Duplicate detection and event-flooding algorithm
The event-flooding algorithm is based on flooding every received notification through all existing channels, and checking whether this notification matches the node’s own subscriptions. As described in figure 3.14, this can cause the same notification to be delivered several times to the same node. A forwarder or a subscriber must be able to detect notifications it has already received so that it does not broadcast or process them again. This way a fired notification does not circulate forever in the network. As expressed in section 3.6.1, notifications must then encapsulate their issuer’s id as well as a serial number, and the list’s length can be set to a minimal working value by testing the algorithm under the worst possible working conditions. In the event flooding publish/subscribe algorithm, such conditions could be defined as the conjunction of the following features:
• A high number of nodes in the network.
• A high network connectivity, with a lot of cycles and many different possible routes between two different nodes.
• A high rate of simultaneous communications; in event flooding publish/subscribe, it means every node fires an event at the same time.

Data structure
Event flooding only manipulates events by firing, forwarding and filtering them. Subscriptions and unsubscriptions are never forwarded. Therefore only events have a useful data structure. Figure 3.16 shows all the information contained in an event of the flooding-based pub/sub algorithm.
Type allows this event to be matched against a subscription so that it is possible to filter it. Content, whose length is bounded by length, provides the actual content of the event, as the single attribute subscription model does not allow much information to be carried in the event’s topic itself. Finally, issuer and serial are the two fields allowing the detection of duplicates with the method described in section 3.6.1.

3.6.2 Event filtering-based strategy
This section describes the algorithm implemented in the “Event_filtering” class of figure 3.6. The event-filtering routing algorithm belongs to the deterministic algorithms, in the sense that it relies on a deterministic routing table to decide where it is relevant to forward a notification, or whether to forward it at all. Although its behavior is therefore deterministic, its performance may be affected by the dynamic connectivity of the underlying network. This is especially true concerning the correctness of deliveries. The general mechanism of the filtering-based routing algorithm is well known and described in [14]. However, it may be relevant to adapt it to the particular context of modular robots and, more precisely, to networks formed by Atron modules. The following sections point out where such an adaptation is relevant.

The algorithm
The idea behind event filtering is rather simple: subscribers broadcast their interests, and when a node receives this broadcast, it also notes which subscription comes from which channel. This subscription is then associated with the channel, and the channel is associated with the union of all subscriptions received through it. This information allows the node to know in which direction the nodes interested in an event matching this subscription are located. In other words, it builds a routing table using reverse path learning. That way, events are no longer broadcast to all nodes in the robot, but only to the interested ones.
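As an illustration of reverse path learning, the sketch below keeps one subscription per channel and selects the channels through which an event must be sent. It assumes the single attribute model (a subscription is a 32-bit mask, a notification is a bit index) and 8 channels per module; the names routing_table, rt_learn and rt_route are hypothetical and do not come from the implementation described here.

```c
#include <stdint.h>

#define CHANNELS 8            /* assumed number of channels per module */

typedef uint32_t sub;         /* single attribute subscription: one bit per topic */
typedef unsigned char notif;  /* notification point: index of a bit, 0..31 */

typedef struct {
    sub received[CHANNELS];   /* union of the subscriptions received through each channel */
} routing_table;

/* Reverse path learning: remember that subscription s came from channel c. */
void rt_learn(routing_table *rt, int c, sub s)
{
    rt->received[c] |= s;
}

/* Returns a bit mask of the channels an event n must be sent through.
 * Pass the incoming channel in `from` when forwarding, or -1 when the
 * node itself fires the event. */
unsigned rt_route(const routing_table *rt, notif n, int from)
{
    unsigned out = 0;
    for (int c = 0; c < CHANNELS; c++) {
        if (c == from)
            continue;                       /* never send an event back */
        if ((rt->received[c] >> n) & 1u)    /* n matches that channel's subscriptions */
            out |= 1u << c;
    }
    return out;
}
```

A channel through which no matching subscription was learned is simply skipped, so an event that interests nobody is not sent at all.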
Initialization
Consider the underlying physical network to be the one at the top of figure 3.4. The algorithm is initialized by computing a spanning tree in a distributed way, as explained in section 3.6.2.

Broadcasting and forwarding subscriptions
The basic principle consists in making every node responsible for all the events it has received a subscription for. When a node n receives a subscription s through its channel c, the following process occurs:
1. Check if the subscription associated with channel c covers the subscription s. If it does, then n already forwards the events expressed in subscription s, and s is therefore not forwarded.
2. If s is not covered, n subscribes to all the information it needs to satisfy s. It does so by checking each of its channels except c. For each of them, it checks which subscriptions it has already sent through it and sends a subscription requesting the missing notification subset. Note that n may already have subscribed to the right events on all channels but c. In this case, n does not subscribe to anything, but carries on with step 3.
3. Integrate the subscription into the event routing table. Integrating a subscription into the routing table consists in associating the subscriber’s id as well as channel c with the subscription. If a subscription already exists for this issuer, then the incoming channel c is updated and the associated subscription becomes the union of the already existing subscription and the newly received one.

Step 1 requires computing whether the subscriptions associated with c cover s. Subscriptions associated with c are kept in memory (see below), and the principle of subscriptions covering each other is described in sections 3.4.2 and 3.4.3. Step 2 iterates on all available channels which are part of the spanning tree (set $B$) except c ($B_c$ in equation 3.8). For each channel d in $B_c$, it computes which subscriptions have already been sent through it ($S_d$ in equation 3.9). Then it removes $S_d$ from s, as equation 3.10 shows (assuming $U_e$ is the union of all subscriptions received through channel e). This happens to be equivalent to the operation minus(s, $S_d$). If the resulting subscription $F_d$ is not a null subscription, then it is sent through channel d. This is also shown in figure 3.17. Step 3 memorizes the new subscription for node i in the event routing table. If $p_{i,c}$ is the previous subscription for node i through channel c and $s_{i,c}$ the received subscription, the new subscription is given by $p_{i,c} \cup s_{i,c}$.

$$B_c = B \setminus \{c\} \tag{3.8}$$
$$S_d = \{x \mid \forall x \in N \wedge \forall e \in C : \exists e \in B_d \wedge x \in U_e\} \tag{3.9}$$
$$F_d = s \setminus S_d \tag{3.10}$$

• $C$: Set of all available channels.
• $B$: Set of all channels part of the spanning tree.
• $B_c$: Set of all channels part of the spanning tree, minus channel c.
• $U_e$: Union of all subscriptions received through channel e.
• $S_d$: Union of all subscriptions sent through channel d.
• $F_d$: Final subscription forwarded through channel d.

Figure 3.17: Only the green part of s is forwarded through channel d

Broadcasting and forwarding notifications
Once subscriptions have been spread through the network, broadcasting an event e is a relatively easy task. An event is sent through every channel c whose associated subscription $U_c$ matches e. In other words, the decision whether or not to send or forward an event e through channel c depends on equation 3.11.

$$e \in U_c \tag{3.11}$$

When a node fires an event, it iterates on all its channels, verifies property 3.11 for each of them and sends the event through the channel if the test succeeds. Forwarding an event is very similar, except that a node forwarding an event does not forward it through the channel it has received this event from. For a given event e fired by a node p, and subscribers s1, s2 and s3 interested in e (whose subscriptions match e), a new network layer is dynamically and implicitly formed by pruning the spanning tree and keeping only the links between the publisher of e and all interested subscribers. An example of such a “layer” is shown at the bottom of figure 3.4.
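The three steps applied to an incoming subscription can be sketched as below. This is only an illustration under stated assumptions: subscriptions are 32-bit masks as in the single attribute model, a module has 8 channels, and the routing table is simplified to one union per channel instead of one record per issuer. The names filter_node, already_sent and handle_subscription are hypothetical.

```c
#include <stdint.h>

#define CHANNELS 8            /* assumed number of channels per module */

typedef uint32_t sub;         /* single attribute subscription: one bit per topic */

typedef struct {
    sub received[CHANNELS];   /* U_e: union of subscriptions received through channel e */
    unsigned tree;            /* B: bit e is set when channel e is in the spanning tree */
} filter_node;

/* Same property as figure 3.12, written as "s2 has no bit outside s1". */
static int covers(sub s1, sub s2) { return (s2 & ~s1) == 0; }
static sub minus(sub s1, sub s2)  { return s1 & ~s2; }

/* S_d (equation 3.9): what was already requested through channel d, that is
 * the union of the subscriptions received through the other tree channels. */
static sub already_sent(const filter_node *n, int d)
{
    sub s = 0;
    for (int e = 0; e < CHANNELS; e++)
        if (e != d && ((n->tree >> e) & 1u))
            s |= n->received[e];
    return s;
}

/* Handles subscription s received through channel c. forward[d] receives
 * F_d, the subscription to send through channel d (0 means nothing to send).
 * Returns the number of channels with something to send. */
int handle_subscription(filter_node *n, int c, sub s, sub forward[CHANNELS])
{
    int count = 0;
    for (int d = 0; d < CHANNELS; d++)
        forward[d] = 0;

    if (covers(n->received[c], s))                   /* step 1: already handled */
        return 0;

    for (int d = 0; d < CHANNELS; d++) {             /* step 2: request what is missing */
        if (d == c || !((n->tree >> d) & 1u))
            continue;
        forward[d] = minus(s, already_sent(n, d));   /* F_d = s \ S_d */
        if (forward[d] != 0)
            count++;
    }

    n->received[c] |= s;                             /* step 3: update the routing table */
    return count;
}
```

For instance, on a node whose tree channels are 0, 1 and 2, a subscription to topics {1, 2} arriving through channel 1 after a subscription to topics {0, 1} arrived through channel 0 is forwarded whole through channel 0, but only topic 2 is requested through channel 2, since topic 1 was already requested there.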
Broadcasting and forwarding unsubscriptions
Forwarding unsubscriptions is a similar process to forwarding subscriptions. Again, each node is responsible for forwarding notifications to interested nodes. To that purpose, it keeps its routing table as small as possible, and sends an unsubscription about a notification subspace as soon as it no longer needs events matching it. As in forwarding subscriptions, the algorithm takes place in three steps. When receiving an unsubscription s through channel c, a node:
1. Checks if the unsubscription changes how notifications will be forwarded after unsubscribing. If the black part in figure 3.18 is null, then the unsubscription has no effect and the process is stopped.
2. Sends, through each channel in the spanning tree except c, an unsubscription from the notification subset which is no longer needed.
3. Removes the notification subset carried by the received unsubscription from the record associated with the unsubscription’s issuer in the event routing table.

Figure 3.18: In step 1: $S_c$ is the union of all subscriptions associated with channel c except the one from the unsubscription’s issuer. In step 2: $S_c$ is all the subscriptions sent through channel c except the one from the unsubscription’s issuer. In both steps, $S_{issuer}$ is the subscriptions previously expressed by the issuer. Step 1 succeeds, or in step 2 the unsubscription is forwarded through channel c, if the black area is not a null subscription. The black area is then considered as being the actual unsubscription.

The goal of step 1 consists in computing which part of the unsubscription is actually worth processing. If the resulting unsubscription is null, then the algorithm stops. If it is not, both steps 2 and 3 are processed. The actual unsubscription (which is forwarded) is computed with equations 3.12 and 3.13.
In step 2, if the actual unsubscription is not null, then for each channel d in the broadcast tree, equation 3.15 computes the unsubscription to be sent through it, in order to carry out the unsubscription received from issuer i through channel c. Step 3 consists in removing the unsubscription $u_{i,c}$ from the subscription previously associated with i. If $p_{i,c}$ is the previous subscription associated with node i and channel c, then its new subscription is $p_{i,c} \setminus u_{i,c}$.

$$R_{i,c} = \{x \mid \forall x \in N \wedge \forall m \in M_c : x \in sub(i) \wedge m \neq i\} \tag{3.12}$$
$$u^{actual}_{i,c} = (sub(i) \setminus R_{i,c}) \cap u_{i,c} \tag{3.13}$$
$$S_{i,d} = \{x \mid \forall x \in N : \exists e \in B_i \wedge x \in R_{i,d}\} \tag{3.14}$$
$$u^{sent}_{i,c,d} = (sub(i) \setminus S_{i,d}) \cap u^{actual}_{i,c} \tag{3.15}$$

• $u_{i,c}$: Unsubscription received from issuer i through channel c.
• $u^{actual}_{i,c}$: Actual unsubscription from $u_{i,c}$ after step 1.
• $u^{sent}_{i,c,d}$: Unsubscription sent through channel d when forwarding unsubscription $u_{i,c}$.
• $S_{i,d}$: Subscriptions previously sent through channel d, except the one issued by module i.
• $sub(m)$: Subscription associated with node m.
• $M_c$: Set of neighbor modules through channel c.
• $R_{i,c}$: Set of subscriptions received from channel c, except the ones issued from node i.

Atron’s specifics
Atron’s particular characteristics make every channel lead to at most one neighbor node. As a consequence, $R_{i,c}$ always equals $\emptyset$. Step 1 is therefore reduced to computing the intersection between the unsubscription received through channel c and the previous subscriptions associated with it. The result of the intersection must be different from $\emptyset$ for the algorithm to continue.

Cope with network dynamics
A central point of communication in a network whose connectivity is constantly changing consists in coping with these permanent topology changes. Obviously, with this event filtering-based method, the problem lies in maintaining both the spanning tree and the event routing table. The basic solution to manage such dynamic connectivity consists in reinitializing the algorithm when a node realizes a change has occurred in the network topology. When this happens, the node spreads a reset request. This request is broadcast to all nodes using a flooding method. Every node receiving such a reset request wipes its event routing table, recomputes a spanning tree and broadcasts again its own subscriptions only (not the ones it used to be responsible for).

Main data structure
Section 3.4.3 presents the data structures and operations as implemented for the single attribute subscription model. As described in the previous sections, the event filtering-based algorithm needs more information than notification subsets or points in order to work properly. This section introduces the actual data structures used in the event-filtering algorithm. These data structures include those of the single attribute model, or of any other subscription model that may be used.

Routing table
The algorithm maintains an event routing table which allows it to route events, but also to cope dynamically with successive subscriptions and unsubscriptions. These are three different use cases for the routing table. When forwarding notifications, nodes only rely on the union of the subscriptions received through the same channel c (named $U_c$). In order to route notifications, the algorithm needs $U_c$ for each c in the set $B$ of all available channels in the spanning tree. Broadcasting subscriptions does not require more information, as all decisions are based on the subscriptions associated with channels as a whole. However, unsubscriptions induce a greater complexity. Consider two nodes n1 and n2 connected to the same node n through the same channel c, and whose subscriptions intersect. If n1 unsubscribes from a notification subset n2 has subscribed to, then removing the notification subset carried by n1’s unsubscription from $S_c$ in node n might affect event deliveries to n2 while it should not.
Therefore, unsubscriptions require every node to keep in memory which subscriptions every individual neighbor node has itself subscribed to. This finer knowledge still allows $U_c$ to be computed whenever needed. It also allows neighbors’ unsubscriptions to be managed without affecting other neighbors linked through the same channel. Figure 3.19 gives a representation of an entry in the routing table. Issuer represents a node’s id; it is likely to be an integer. Channel represents the channel through which the last subscription from issuer has been received. Again, a channel’s identifier is likely to be an integer. Finally, subscription represents the notification subset issuer has subscribed to. This data structure is given by the subscription model implemented. Supposing it uses the single attribute implementation introduced in section 3.4.3, this is a 32-bit integer.

Subscription table entry: issuer | channel | subscription

    typedef struct {
        char issuer;
        chan channel;
        sub subscriptions;
    } sub_table_entry;

Figure 3.19: A representation of an entry of an event filtering-based routing table, and its equivalent in C.

Subscription: Notification subset | Issuer

    typedef struct {
        sub sub;
        char issuer;
    } subscription;

Figure 3.20: A data structure for subscriptions and its representation in C.

Considering that a node can have at most c channels, each channel allowing direct contact with at most n neighbor nodes, the complete routing table contains c × n entries. However, the peer-to-peer network in Atron-based modular robots presents particular properties. Among them is the number of nodes accessible through the same channel, which is at most 1. In addition, all modules in an Atron-based modular robot are similar, and are equipped with 8 communication devices or channels. Therefore on Atrons, n = 1 and c = 8, so there are 8 entries in the routing table, regardless of how many modules the whole Atron-based robot is composed of.
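To illustrate why per-issuer entries matter, the sketch below stores one entry per issuer, recomputes the union associated with a channel on demand, and applies an unsubscription to the issuer’s own entry only. It is a simplified illustration, not the implementation described here: subscriptions are assumed to be 32-bit masks, and the names rt_entry, rt_table, rt_subscribe, rt_unsubscribe and rt_channel_union, the used flag and the table size are hypothetical.

```c
#include <stdint.h>

#define MAX_ENTRIES 8          /* hypothetical table size */

typedef uint32_t sub;          /* single attribute subscription: one bit per topic */

typedef struct {
    int used;                  /* 0 when the entry is free (added for this sketch) */
    unsigned char issuer;
    unsigned char channel;
    sub subscriptions;
} rt_entry;

typedef struct {
    rt_entry entry[MAX_ENTRIES];
} rt_table;

/* Records subscription s from `issuer`, received through `channel`.
 * Returns 0 on success, -1 when the table is full. */
int rt_subscribe(rt_table *t, unsigned char issuer, unsigned char channel, sub s)
{
    rt_entry *slot = 0;
    for (int i = 0; i < MAX_ENTRIES; i++) {
        rt_entry *e = &t->entry[i];
        if (e->used && e->issuer == issuer) {
            e->channel = channel;        /* the incoming channel is updated */
            e->subscriptions |= s;       /* union with the existing subscription */
            return 0;
        }
        if (!e->used && !slot)
            slot = e;                    /* remember the first free entry */
    }
    if (!slot)
        return -1;
    slot->used = 1;
    slot->issuer = issuer;
    slot->channel = channel;
    slot->subscriptions = s;
    return 0;
}

/* Removes the subset u from the entry of `issuer` only. */
void rt_unsubscribe(rt_table *t, unsigned char issuer, sub u)
{
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (t->entry[i].used && t->entry[i].issuer == issuer)
            t->entry[i].subscriptions &= ~u;
}

/* Computes the union of the subscriptions received through `channel`. */
sub rt_channel_union(const rt_table *t, unsigned char channel)
{
    sub u = 0;
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (t->entry[i].used && t->entry[i].channel == channel)
            u |= t->entry[i].subscriptions;
    return u;
}
```

If two issuers behind the same channel both subscribe to a topic and one of them unsubscribes, the union recomputed for that channel still contains the topic, so deliveries to the other issuer are not affected.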
Also, since only one node is reachable through a given channel, either channel or issuer is redundant information and one of them can be removed from the routing table.

Subscription
As described in section 3.6.2, the event filtering algorithm needs the id of each subscription’s issuer so that it can set up an event routing table. Therefore, every subscriber attaches its id to its subscription. Figure 3.20 shows a subscription in an event filtering-based publish/subscribe network. It carries the issuer’s id (an integer) as well as the subscriber’s interest. This latter field of the data structure depends on the subscription model employed; with the single attribute subscription model, this field is a 32-bit integer. The previous section shows that on Atrons, either the issuer or the incoming channel is redundant. As the incoming channel is always known, the issuer field in subscriptions for the event filtering-based algorithm could be ignored and removed.

Notification: Notification point | Content | Length

    typedef unsigned char event_content[EVENT_MAX_LENGTH];

    typedef struct {
        notif type;
        event_content content;
        int length;
    } event;

Figure 3.21: A data structure for notifications and its representation in C.

Unsubscription: Notification subset | Issuer

    typedef struct {
        sub sub;
        char issuer;
    } unsubscription;

Figure 3.22: A data structure for unsubscriptions and its representation in C. Note its similarity to the data structure for subscriptions (figure 3.20).

Notification
Section 3.6.2 shows how simple forwarding a notification is when using the event routing table. The only useful information is the notification itself, as described in section 3.4.3. However, with the single attribute subscription model, the notification does not carry much useful information. Therefore, a notification also includes a field “content”, a sequence of bytes, to which any useful information can be attached.
Another field, length, specifies how long the sequence of bytes of the notification’s content is. Figure 3.21 gives a representation of notifications in the event filtering algorithm.

Unsubscription
In the event filtering algorithm, an unsubscription is roughly equivalent to a subscription. The only difference is the name of the data structure type, which makes it possible to distinguish a subscription from an unsubscription. As shown in section 3.6.2, the event filtering algorithm only needs to know the unsubscriber’s id (the issuer) as well as the unsubscription set. An unsubscription for the event filtering algorithm is shown in figure 3.22. As for subscriptions, the issuer field becomes redundant and useless when the event filtering algorithm is run on an Atron-based modular robot. An Atron module does not need the id of an unsubscription’s issuer when receiving an unsubscription through a given channel, since only one module can communicate through that channel.

3.7 Conclusion
Two algorithms are implemented for Atron modular robots. The first one is based on event flooding: a simple algorithm, yet it requires a mechanism to handle a network with loops. This mechanism introduces complexity into an algorithm supposed to be very simple. It may even cause mistakes in event deliveries: the algorithm can deliver the same event several times if too many events are circulating at the same time, and if its memory is too small to store and recognize all of them. This section also describes in detail the implementation of the filtering-based, or routing table-based, algorithm, and shows all the complexity involved in this technique. However, it does not provide any solution allowing this algorithm to handle a dynamic underlying network efficiently.

Chapter 4 Experiments

4.1 Introduction
This section describes the performance measurements of the algorithms whose implementation is described in section 3.
An algorithm’s efficiency is measured along several characteristics such as its complexity (spatial or temporal) or how much communication it generates. Here the algorithms are designed for modular robots, for which scalability is an important feature. Scalability is the ability to keep properties of efficiency on modular robots composed of a high number of modules. This section describes, following the survey in [8], which features are relevant to measure. It also describes the performance measurement methodology implemented in this work, as well as the results obtained when testing the implemented algorithms. This section closes with a conclusion where the results are discussed.

4.2 Relevant features to measure
Legatheaux Martins et al. [34] define a correct routing algorithm as an algorithm without false negatives. A false negative is a missing delivery of an event to an interested node. On the other hand, an optimal algorithm is an algorithm which never delivers an event to a node that has not expressed any subscription matching it; such deliveries are called false positives. The main issue consists in finding the right balance between improving the routing of notifications to hosts (reducing or eliminating false negatives and false positives) and reducing computation, memory consumption and communication overhead. This latter characteristic is critical to make the communication system usable with a high number of nodes in the network. Baldoni et al. [8] suggest three features to consider when evaluating a routing algorithm:
• Message overhead: Amount of messages sent by the system when sending either notifications or subscriptions. A convenient measure for message overhead is message “hops”: the number of links a message traverses to reach all its recipients.
• Memory overhead: Amount of information every individual node has to memorize to ensure the routing algorithm can work properly. It is mostly related to the memorization of subscriptions by each node, so that it can forward notifications.
• Subscription language limitation: The routing algorithm can limit the expressiveness of the subscriptions the system can handle.

Besides, when evaluating a publish/subscribe system for dynamic networks, several operating conditions must be considered, along different dimensions:
• Underlying network dynamics: nodes failing, leaving, joining or changing location affect the network’s connectivity.
• Dynamics of the nodes’ interests, with live subscriptions and unsubscriptions. This affects the publish/subscribe-level network topology and can induce false positives or false negatives.
• Different publishers/subscribers ratios, especially for stochastic-based routing algorithms: how they can manage a small number of subscribers with numerous publishers, the opposite situation, or a more balanced ratio.
• Number of nodes in the network: a test of the event routing system’s scalability. An event routing algorithm may show good performance with a couple of dozen nodes, and see it drop drastically in a network composed of hundreds or thousands of nodes.

An ideal event routing algorithm ensures every event takes the shortest path to reach all its subscribers, and does not reach any other node (no false positives and no false negatives). In addition, it must guarantee this property regardless of dynamic reconfigurations of both the underlying network and the subscriptions and unsubscriptions. It must work with a low, a high or a medium publishers/subscribers ratio, and be equally efficient in small and big networks.

4.3 Measurement methodology
Performance is measured along several different possible features concerning a robot’s characteristics. These characteristics can be the number of modules it is composed of (section 4.3.1), or the use it makes of publish/subscribe communications (section 4.3.2). A system can be composed of a restricted number of subscribers and many publishers, or on the contrary of many subscribers and a limited number of publishers, or of a more balanced publishers/subscribers ratio.
When measuring performance, both aspects are considered for all the algorithms analyzed. The measurement focuses on the ratio between the number of emitted messages (subscriptions/unsubscriptions or notifications) and the number of nodes. This ratio can be computed in two ways:

Mean message count per module in the global robot. This mean value is obtained by counting the total number of messages sent in the whole modular robot, divided by the number of modules. The procedure is repeated several times, and a global mean and standard deviation are extracted from all these repetitions. A low standard deviation denotes that the algorithm’s behavior is stable.

Mean and standard deviation in message count for each individual module. This method counts, for each module, how many messages it has sent. The mean value denotes how much communication the network has to handle. The standard deviation indicates how this communication load is distributed among the nodes: a low standard deviation means the load is evenly balanced, whereas a high value denotes a bottleneck.

Measuring false positives and false negatives remains very hard: depending on the algorithm, subscriptions may not be expressed at all, or may be expressed in answer to a previous incoming subscription. This makes it difficult to monitor the network, listen to matching subscriptions, unsubscriptions and notifications, and correlate them to detect false positives and false negatives.

Figure 4.1: Three modules compose a unitary tile and its associated network topology.

Figure 4.2: A robot composed of 3×1×3 tiles, or 27 modules; notice the network topology has a lot of cycles.

4.3.1 Performances along number of modules in a robot

Simulations in performance measurements involve robots whose size is expanded along successive simulations.
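For each robot size in such a series of simulations, the second counting method of section 4.3 boils down to a mean and a standard deviation over per-module message counts. The C sketch below illustrates this computation under stated assumptions: the names `load_stats`, `per_module_stats` and `newton_sqrt` are hypothetical and do not come from the implementation described in this work, the population (not sample) standard deviation is assumed, and the square root is computed by Newton iteration only to keep the example free of external libraries.

```c
#include <stddef.h>

/* Hypothetical result type: mean and standard deviation of message counts. */
typedef struct {
    double mean;
    double std_dev;
} load_stats;

/* Square root by Newton's iteration (illustrative helper, avoids libm). */
static double newton_sqrt(double x)
{
    if (x <= 0.0) {
        return 0.0;
    }
    double r = (x > 1.0) ? x : 1.0;   /* start above the root */
    for (int i = 0; i < 60; i++) {
        r = 0.5 * (r + x / r);
    }
    return r;
}

/* sent[i] is the number of messages sent by module i. */
load_stats per_module_stats(const unsigned *sent, size_t module_count)
{
    load_stats stats = {0.0, 0.0};
    if (module_count == 0) {
        return stats;
    }

    double sum = 0.0;
    for (size_t i = 0; i < module_count; i++) {
        sum += sent[i];
    }
    stats.mean = sum / (double)module_count;

    double squared_deviation = 0.0;
    for (size_t i = 0; i < module_count; i++) {
        double d = (double)sent[i] - stats.mean;
        squared_deviation += d * d;
    }
    /* Population standard deviation: a high value hints at a bottleneck. */
    stats.std_dev = newton_sqrt(squared_deviation / (double)module_count);
    return stats;
}
```

With the counts {2, 4, 4, 4, 5, 5, 7, 9}, this sketch yields a mean of 5 and a standard deviation of 2. The same function could be applied to the per-run values of the first counting method to obtain its global mean and standard deviation.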
A growing number of modules makes it possible to measure the impact of the robot’s size on the overall performance of the algorithms. Robots in simulations are composed of one or several unitary tiles such as the one shown in figure 4.1. These tiles are organized into a rectangular parallelepiped; figure 4.2 represents a 3 × 1 × 3-tile example. The physical network of a robot composed of several such tiles is dense and contains several loops. This characteristic is a good challenge for the algorithms to be tested, as they have to cope with many cycles. As the number of modules grows, the distance between publishers and subscribers may also grow. The same goes for the number of publishers and/or subscribers. A growing number of modules may therefore lead to a growing communication overhead, depending on the algorithm and on the scenario of use of the publish/subscribe system, as section 4.3.2 describes.

                          Publishers
Subscribers               Few     Many
Few                       1       2
Many, covering            3       5
Many, non-covering        4       6

Table 4.1: 6 different possible scenarios in the use of a publish/subscribe system.

4.3.2 Performances along publishers/subscribers ratio in the system

When simulated several times, the same robot configuration can produce very different network overheads for the same publish/subscribe algorithm. The differences can come from a different use of the publish/subscribe system, such as the number and the ratio of publishers and subscribers. Table 4.1 illustrates six different possible scenarios. Covering and non-covering express the fact that the subscribers’ subscriptions either all cover each other or do not intersect at all. This may influence the algorithms’ performance: different scenarios may give one algorithm an advantage over the other.

Scenario 1. In scenario 1, there is only one subscriber and one publisher. They subscribe to and publish about the same topic, and are situated at opposite corners of the robot’s structure.
This scenario makes both algorithms generate little traffic, because of the small number of subscribers and publishers.

Scenario 2. The second simulation scenario involves one module subscribing to a topic. All other modules publish notifications matching this topic, and the subscriber is expected to receive them all. This exposes the flooding-based algorithm to much worse performance than the event filtering algorithm.

Scenario 3. In this scenario, every module but one subscribes to the same topic. The remaining module publishes about this topic, and its notification is expected to reach every other module exactly once. This scenario gives neither event routing algorithm a particular advantage or disadvantage.

Scenario 4. This scenario again involves one publisher and all other modules as subscribers. Each subscriber subscribes to a different topic, and these topics do not intersect. Only one module is interested in the notification fired by the publisher. This large number of different subscriptions puts the filtering-based algorithm in bad conditions compared to the flooding-based one, which is very comfortable with a unique publisher.

Scenario 5. In scenario 5, half of the modules subscribe to the same topic, and the other half publish about this topic. Subscribers and publishers are evenly spread out in the robot. Every subscriber must receive every notification. This scenario allows the event filtering algorithm to take advantage of similar subscriptions, whereas it makes event flooding flood many events, as in scenario 2.

Scenario 6. Scenario 6 is similar to scenario 5, except that subscribers subscribe to topics that do not intersect. Each publisher publishes about a topic in which only one subscriber is interested. Every subscriber must receive exactly one notification, except if the number of modules is odd and there is one more subscriber than publishers. This scenario puts both algorithms at a disadvantage.
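The six scenarios above can be summarized by how many publishers and subscribers they involve for a robot of n modules. The C sketch below is only illustrative: `scenario_roles_t` and `scenario_roles` are hypothetical names, n ≥ 2 is assumed, and the odd-n split of scenario 5 is assumed to follow the one stated for scenario 6 (one more subscriber than publishers).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of who publishes and who subscribes. */
typedef struct {
    unsigned publishers;
    unsigned subscribers;
    bool covering;  /* true when all subscriptions target the same topic */
} scenario_roles_t;

/* Fills *out for scenario 1..6 on a robot of n modules.
 * Returns false for an unknown scenario, n < 2 or a NULL output. */
bool scenario_roles(int scenario, unsigned n, scenario_roles_t *out)
{
    if (n < 2 || out == NULL) {
        return false;
    }
    switch (scenario) {
    case 1:  /* one publisher, one subscriber (covering is moot) */
        out->publishers = 1;      out->subscribers = 1;      out->covering = true;
        break;
    case 2:  /* one subscriber, all other modules publish */
        out->publishers = n - 1;  out->subscribers = 1;      out->covering = true;
        break;
    case 3:  /* one publisher, all others subscribe to the same topic */
        out->publishers = 1;      out->subscribers = n - 1;  out->covering = true;
        break;
    case 4:  /* one publisher, all others subscribe to distinct topics */
        out->publishers = 1;      out->subscribers = n - 1;  out->covering = false;
        break;
    case 5:  /* half and half, same topic (odd n: assumed extra subscriber) */
        out->publishers = n / 2;  out->subscribers = n - n / 2;  out->covering = true;
        break;
    case 6:  /* half and half, distinct topics (odd n: extra subscriber) */
        out->publishers = n / 2;  out->subscribers = n - n / 2;  out->covering = false;
        break;
    default:
        return false;
    }
    return true;
}
```

For the 27-module robot of figure 4.2, this gives 26 publishers and 1 subscriber in scenario 2, and 13 publishers and 14 subscribers in scenario 6.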
All these scenarios may reveal the algorithms’ strengths and weaknesses when subscribing or publishing. Some scenarios may favor one algorithm and penalize the other, while others may reverse the advantage.

Initialization and synchronization. The algorithms’ initialization and synchronization can also induce a lot of network overhead. Both algorithms use distributed barriers, as described in section 3.3.1, which flood messages in the network. In addition, the event filtering method employs a loop-free broadcast tree obtained by pruning the physical network’s connectivity. This spanning tree can be calculated with the method described in section 3.6.2. The cost of this initialization in terms of network overhead is analyzed by counting the number of barrier-related or spanning tree-related packets as a function of the number of modules in the robot.

4.3.3 Performances along the number of events sequentially fired

A realistic use of a publish/subscribe system consists in one or several modules firing several notifications (dozens, hundreds or thousands) after the system has been initialized. This variant can be played in all the scenarios described in section 4.3.2. Firing several notifications might affect the performance of one algorithm compared to the other. With only a few events fired, event flooding may prove more efficient because of the expensive initialization of the filtering algorithm. However, the balance can reverse once enough events have been fired for the costly flooding of each event to generate more messages than the filtering algorithm’s initialization.

4.3.4 Performances along event firing rate

A controller using a publish/subscribe system can fire events either occasionally or very frequently. A very high event firing rate can lead to poor performance or even errors. An example of an error due to too many messages circulating in the network is explained in section 3.6.1. It shows a possible limitation of the flooding technique in duplicate detection.
The flooding technique may misbehave when too many messages are received at the same time, for instance by letting messages circulate indefinitely in a loop.

4.4 Algorithms performance and comparison

4.4.1 Experimental measurements of messages count

Experimental setup. Measurements are run with a simulator for modular robots [16], which can simulate a virtual modular robot (Odin, Atron and more) as well as run a controller for it. The simulator also monitors communications, which eases the measurement of the algorithms’ message overhead. However, memory limitations in the simulator prevent the simulation of robots bigger than a 3 × 1 × 3-tile robot (the robot in figure 4.2). The measurement consists in counting the mean message count per module in the global robot, that is, the first counting method described in section 4.3. All six scenarios are simulated 30 times for each robot.

Experimental measurement. Running the experiments produces the graphs shown in figure 4.1. They show that the flooding method generates many notification messages, while its number of subscription messages remains zero. This is expected, as the event flooding-based method does not use subscription messages at all. The event-filtering algorithm, on the other hand, generates a significant amount of subscriptions, although this amount decreases when many nodes subscribe to the same topic. Event notifications in the filtering-based approach generate a message traffic that is almost constant, or grows only slightly, with the number of modules in the simulated robot.

The first simulation clearly shows that the flooding method generates more traffic (including both notifications and subscriptions) than the filtering-based approach. This is because the flooding approach makes every node broadcast messages through all its channels.
Although the filtering approach also broadcasts subscriptions, simulation 1 shows that this broadcast generates far fewer messages than the flooding algorithm, and the message overhead induced by the filtering algorithm to route events does not offset this difference.

The second simulation highlights this difference. This scenario clearly shows the limitations of the flooding-based algorithm when it has to cope with multiple successive notifications. In the flooding approach, publishing more events for a unique subscriber makes all nodes forward the same messages several times. The filtering method forwards these events directly toward the recipient, and toward it only. As a consequence, message traffic for the event flooding-based algorithm rises much faster than for the filtering method.

The third simulation stresses the behavior of the event filtering algorithm when several nodes have to subscribe to the same topic. Figure 4.1(c) shows that the subscription filtering feature of the filtering algorithm efficiently limits the broadcast of subscriptions: it stays far lower than the message traffic due to event propagation with the flooding technique. The total number of messages (notifications and subscriptions) is also much lower for the event filtering approach than for event flooding.

The fourth simulation, on the other hand, shows the limits of the filtering capabilities of the filtering algorithm. When every node subscribes to a different topic, all these subscriptions have to be broadcast to all nodes, with one broadcast per node’s subscription. This is the worst possible scenario for the event filtering-based strategy, in which even the event flooding-based method appears to be more efficient.

The fifth simulation compares the behavior of the event flooding-based and event filtering methods in a scenario that is unfavorable to both algorithms.
On the one hand, the event flooding strategy has to handle multiple notifications; on the other hand, the filtering-based method has to manage multiple similar subscriptions. The simulation results shown in figure 4.1(e) demonstrate the better performance of the event filtering algorithm.

The sixth simulation aims to reproduce the fifth one while making the situation worse for the event filtering method: not only do many nodes subscribe, but their subscriptions do not intersect. However, figure 4.1(f) shows that the event-filtering method still performs better than the event-flooding algorithm.

Unexpected peak. Figures 4.1(c) and 4.1(d) show a peak when the number of modules reaches 18, followed by a drastic fall. This is an unexpected result, as the number of notification messages is expected to grow markedly with the number of modules. Running more simulations, a hundred times or more per scenario and robot configuration, could produce different results. Figure 4.2, obtained by running each simulation 5 times, tries to show the possible results of such an attempt; however, this experiment should be run many more times to get reliable results. The small values obtained from the simulations could be the result of too short a simulation running time. Another attempt could therefore consist in running more simulations and giving them more time before stopping them, so that all messages have time to spread. However, every simulation already runs for 15 seconds, which should be long enough to spread both all subscriptions and all notifications.

Message count related to initialization and synchronization. Figure 4.3 represents the number of messages due to the algorithms’ initialization and synchronization.
Initialization prepares the algorithm before it can actually route subscriptions and notifications. The event-flooding method requires no initialization, whereas the filtering-based approach builds the spanning tree during its initialization phase, which spreads a significant amount of messages through the network. Synchronization refers to the use of a distributed barrier when all modules in a robot are starting up, or between the initialization and running phases of the filtering algorithm. Both flooding and filtering algorithms use synchronization. Figure 4.3 first makes it obvious that initialization and synchronization generate much more traffic than both event flooding and filtering algorithms as measured in section 4.4.1. It also shows that the event filtering-based implementation generates more overhead than event flooding. This is due to the building of the spanning tree, as well as to the use of more distributed barriers.

[Figure 4.1, panels (a) to (f): mean packet count per module as a function of module count, with curves for flooding interests, flooding notifications, flooding total, filtering interests, filtering notifications and filtering total.]
(a) event-1pub-1sub: message count as a function of module count with scenario 1 (1 subscriber and 1 publisher).
(b) event-Npub-1sub: message count as a function of module count with scenario 2 (1 subscriber and several publishers).
(c) event-1pub-NsubCovering: message count as a function of module count with scenario 3 (1 publisher and several subscribers to a same subscription).
(d) event-1pub-NsubNonCovering: message count as a function of module count with scenario 4 (1 publisher and several subscribers to non-intersecting subscriptions).
(e) event-Npub-NsubCovering: message count as a function of module count with scenario 5 (several publishers and several subscribers to subscriptions covering each other).
(f) event-Npub-NsubNonCovering: message count as a function of module count with scenario 6 (several publishers and several subscribers to non-intersecting subscriptions).
Figure 4.1: Message count of flooding-based and filtering-based approaches (synchronization and initialization are not taken into account).

[Figure 4.2, event-1pub-NsubCovering: mean packet count per module as a function of module count, for 18 to 27 modules, with the same six curves.]
Figure 4.2: The unexpected behavior shown in figures 4.1(c) and 4.1(d) does not always reproduce.

4.4.2 Space analysis

This section attempts to evaluate the memory required to run the publish/subscribe algorithms. This space complexity is evaluated as a function of the number n of modules composing the modular robot. A complexity of O(n) means the required amount of memory is proportional to the number of modules.
Publish/subscribe algorithm’s space complexity

Event flooding. Section 3.6.1 shows that the event flooding algorithm requires little memory, although it has to handle message duplicates. The memory used for that purpose may have to be extended as the number of nodes grows, as the network connectivity gets denser, or as more events are fired. As the size of this memory depends neither directly nor exclusively on the number of modules, its space complexity is considered as O(1). Also, the memory used to store all local subscriptions does not depend on the number of nodes and is also considered as being O(1).

[Figure 4.3, Initialization and synchronization: mean packet count per module as a function of module count, for flooding and filtering.]
Figure 4.3: Message overhead due to initialization and synchronization.

Filtering routing. As explained in section 3.6.2, the filtering-based algorithm builds a routing table recording which interests nodes have in the direction of each channel. Also, for each channel, the algorithm stores the individual subscription of each neighbor. The filtering algorithm therefore uses a routing table consisting of c × b records, where c is the number of channels and b the number of neighbors directly accessible through the same channel. On Atron-based modular robots, c is lower than or equal to 8, and b is always 1. Neither c nor b grows with the number of modules in the global robot. Therefore, the space complexity of the filtering algorithm itself is O(1). In the implementation introduced in section 3.6.2, more mechanisms are employed to build a spanning tree as well as to synchronize modules. The spanning tree construction requires every module to mark itself and its channels, independently from its neighbors. The number of required markers is therefore independent from the number of modules in the global robot, so the space complexity of building the spanning tree is also O(1).
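The c × b sizing argument can be illustrated with a fixed-size table. This is a sketch under assumptions, not the data structure of section 3.6.2: `subscription_t` is a hypothetical one-attribute interval, `routing_table_t`, `routing_table_record_count` and `select_channels` are illustrative names, and excluding the incoming channel when forwarding is an assumption of this example. Only the bounds c ≤ 8 and b = 1 come from the text.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHANNELS 8           /* c <= 8 on an Atron module (from the text) */
#define NEIGHBORS_PER_CHANNEL 1  /* b = 1 on Atron (from the text) */

/* Hypothetical subscription: a closed interval over one numeric attribute. */
typedef struct {
    bool active;  /* false means "no subscription recorded" */
    int low;
    int high;
} subscription_t;

/* One record per (channel, neighbor) pair: the size never depends on n. */
typedef struct {
    subscription_t records[MAX_CHANNELS][NEIGHBORS_PER_CHANNEL];
} routing_table_t;

size_t routing_table_record_count(void)
{
    return (size_t)MAX_CHANNELS * NEIGHBORS_PER_CHANNEL;
}

static bool matches(const subscription_t *s, int value)
{
    return s->active && value >= s->low && value <= s->high;
}

/* Marks in forward[] the channels holding a record that matches the event
 * value, skipping the channel the event came from (assumption).
 * Pass incoming = -1 for a locally published event.
 * Returns the number of channels selected. */
size_t select_channels(const routing_table_t *table, int value,
                       int incoming, bool forward[MAX_CHANNELS])
{
    size_t selected = 0;
    for (int c = 0; c < MAX_CHANNELS; c++) {
        forward[c] = false;
        if (c == incoming) {
            continue;
        }
        for (int b = 0; b < NEIGHBORS_PER_CHANNEL; b++) {
            if (matches(&table->records[c][b], value)) {
                forward[c] = true;
            }
        }
        if (forward[c]) {
            selected++;
        }
    }
    return selected;
}
```

Whatever the number of modules in the robot, `sizeof(routing_table_t)` stays the same in this sketch, which is the O(1) property claimed above for the filtering algorithm itself.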
Actual implementation with synchronization. As explained in section 3.4.1, the code in the PublishSubscribe class (figure 3.6) runs both a double distributed barrier for synchronization and a memory buffer used to move bytes from one thread to another. This memory buffer has a fixed size, although this size may be increased in situations involving many incoming messages at the same time. This is particularly important when the network has a high connectivity: distributed barriers as well as the underlying publish/subscribe algorithm may make intense use of flooding. The space complexity of message management is considered to be O(1).

On the other hand, distributed barriers require remembering the state of each node in the robot, thus an amount of memory directly proportional to the number of nodes is necessary. The space cost of synchronization is therefore O(n). The other memory required to make the whole algorithm work consists of simple integer variables, whose space complexity does not exceed O(1). The space complexity of the common code dedicated to message reception and to the algorithms’ synchronization is therefore O(n).

                                    Flooding   Filtering
Initialization & synchronization    O(n)       O(n)
Algorithm                           O(1)       O(n)
Total                               O(n)       O(n)

Table 4.2: Space complexity of different algorithms’ implementation.

4.5 Conclusion

The complete implementation of a publish/subscribe system is composed of an algorithm and its synchronization. As described in the previous sections and shown in table 4.2, the algorithms’ space complexity is at most O(n) and that of synchronization is O(n). The space complexity of the complete implementations is therefore O(n). Sections 4.3.2 and 4.4.1 describe different scenarios that stress the performance of the different algorithms. On the one hand, scenario 2 features both the worst case for the event flooding algorithm and the best case for the event filtering-based method: few subscriptions and many notifications.
On the other hand, scenario 4 favors event flooding and penalizes the event filtering strategy: many different subscriptions and few notifications. The most interesting scenarios are scenario 1, the best case for both algorithms (a unique subscription and notification), and scenario 6, the worst case for both of them (several distinct subscriptions and several notifications). Scenarios 3 and 5 allow the algorithms to combine their own properties with the opportunities the scenarios offer: few notifications to route on the one hand, and filtering of similar subscriptions on the other hand. The event flooding strategy outperforms the event filtering-based method only when the former is in ideal conditions and the latter in its worst case. All other simulated situations demonstrate a more message-efficient communication management by the event-filtering algorithm. However, measurements show for both algorithms an exponential-like curve of the sent message count as a function of the number of modules in the robot. Moreover, section 4.4.1 shows that initialization and synchronization actually consume many more messages than the event routing algorithms themselves. This measure demonstrates the price the event filtering approach pays to be more efficient than event flooding. The algorithms involved in synchronization and in the computation of the distributed spanning tree are nevertheless out of the scope of this study.

Chapter 5 Conclusions

5.1 Future work

This work mainly focuses on implementing and evaluating different publish/subscribe algorithms. It is mostly concerned with their efficiency in terms of network communication and with how they cope with dynamic networks. However, it provides few solutions to cope with network dynamics, especially concerning the event-filtering strategy. Coping with a dynamic underlying network would, in that case, mean adapting both the broadcast tree and the event routing tables to the changes. The event flooding strategy may not benefit from any further improvement.
Indeed, despite its very high operating cost, its reliability with regard to network topology changes already makes it ideal: the simplicity of its operation ensures that every event is delivered to the interested nodes. Concerning the event-filtering strategy, on the other hand, many improvements can be made. A better spanning tree algorithm can be implemented, such as the one described in [26] or the one in [36]. The latter introduces a way to repair spanning trees broken by the failure, departure or arrival of nodes. This would make it possible to carry on toward a filtering algorithm fully able to cope with network topology changes. Finally, gossip-based algorithms may also be investigated in order to provide an efficient way to handle the communications of swarm modular robots.

5.2 Conclusion

Publish/subscribe systems are a well-known approach in computer science and in pervasive computing. These algorithms decouple data producers from data consumers, and are thus suitable for networks where nodes and links are expected to change in a random way. This is the case of dynamically reconfigurable or self-reconfigurable modular robots: while the robot operates, its modules may be physically rearranged (possibly by themselves), which affects the topology of the communication network. Their network can be likened to mobile ad-hoc networks (MANETs), and publish/subscribe solutions exist for such networks. No modular robot-related publication describes in detail the design of publish/subscribe communications. However, the decoupling between publishers and subscribers could drastically ease the development of algorithms for modular robots. This work investigates different publish/subscribe algorithms for modular robots, and more particularly for the Atron modular robot developed at the Mærsk Mc-Kinney Møller Institute (University of Southern Denmark).
A first, simple approach consists of flooding events through all communication devices, whereas a second relies on the modules’ subscriptions to forward events efficiently through the network. Performance measurements in a static environment generally show drastically better performance for techniques based on event routing tables and efficient event forwarding than for the pure event-flooding method. However, this better performance comes at the price of a much more costly initialization, required to build a proper loop-free network by pruning the links that form loops. Also, no test has been run in a dynamic network, and the performance of the event routing table-based solution might suffer a lot more from it than the flooding method.

List of Figures

1.1 Fukuda’s cellular robot
1.2 Two modular robots
1.3 Polybot chain-type robot in a wheel configuration
1.4 An atron robot and an atron module.
1.5 Multiple configurations makes different networks
2.1 Channelization problem
2.2 Split the notification space in cells and map them to groups
2.3 Notification space composed of two numeric attributes
3.1 Deployment diagram of the publish/subscribe system on an atron robot.
3.2 An example of a distributed barrier
3.3 Event routing example for flooding algorithm
3.4 Pruning the physical connectivity in filtering algorithm
3.5 Nodes’ interactions with filtering algorithm
3.6 Class diagram of the implementation
3.7 C code for subscription initialization
3.8 C code to check if a subscription is null
3.9 C code for subscription union
3.10 C code for subscription intersection
3.11 C code for subscription difference
3.12 C code for subscriptions covering
3.13 C code for matching notifications
3.14 Packet broadcasted in a cyclic network without duplicate detection
3.15 An example of the packet memory list.
3.16 C structure for an event in flooding algorithm
3.17 Only the green part of s in forwarded through channel d
3.18 Filtering unsubscriptions in filtering-based algorithm
3.19 Entry of an event filtering-based routing table
3.20 A data structure for subscriptions and its modelisation in C.
3.21 A data structure for notifications and its modelisation in C.
3.22 A data structure and its modelisation in C
4.1 A tile of three modules
4.2 3 × 1 × 3-tiled modular robot
4.1 Message count of flooding-based and filtering-based approaches.
4.2 Unexpected behavior not always reproduced
4.3 Message overhead due to initialization and synchronization.

List of Tables

4.1 6 different possible scenarios in the use of a publish/subscribe system.
4.2 Space complexity of different algorithms’ implementation.

References

[1] M. Adler, Z. Ge, J. Kurose, D. Towsley, and S. Zabele.
Channelisation problem in large scale data dissemination. In Ninth international conference on network protocols, pages 100–109, 2001.
[2] M. Aguilera, R. Strom, D. Sturman, M. Astley, and T. Chandra. Matching events in a content-based subscription system. In Eighteenth annual ACM symposium on principles of distributed computing, pages 57–61, 1999.
[3] E. Anceaume, A.K. Datta, M. Gradinariu, and G. Simon. Publish/subscribe scheme for mobile networks. In Proceedings of the 2002 workshop on principles of mobile computing (POMC 2002), pages 74–81, 2002.
[4] ASE. http://modular.mmmi.sdu.dk/wiki/ase. URL http://modular.mmmi.sdu.dk/wiki/ASE. Open source control library for modular robots.
[5] S. Baehni, C. Chhabra, and R. Guerraoui. Mobility friendly publish/subscribe. Technical report 200488, EPFL, 2004.
[6] R. Baldoni and A. Virgillito. Distributed Event Routing in Publish/Subscribe Communication Systems: a survey (revised version). Technical report, MIDLAB 1/2006 Dipartimento di Informatica e Sistemistica A.Rubert, Università di Roma la Sapienza, 2006. URL http://www.dis.uniroma1.it/ midlab.
[7] R. Baldoni, R. Beraldi, G. Cugola, M. Migliavacca, and L. Querzoni. Structure-less content-based routing in mobile ad hoc networks. In Proceedings of the international conference on pervasive services (ICPS’05), Santorini, Greece, July 2005.
[8] R. Baldoni, R. Beraldi, S.T. Piergiovanni, and A. Virgillito. On the modelling of the publish/subscribe communication systems. Concurrency and computation: Practice and experience, volume 17, pages 1471–1495. 2005.
[9] R. Baldoni, C. Marchetti, A. Virgillito, and R. Vitenberg. Content-based publish-subscribe over structured overlay networks. In International conference on distributed computing systems (ICDCS 2005), 2005.
[10] G. Banavar, T. Chandra, B. Mukherjee, J. Nagarajarao, R. Strom, and D. Sturman. An efficient multicast protocol for content-based publish/subscribe systems. In Proc.
19th IEEE International Conference on Distributed Computing Systems (1999), pages 262–272, 1999.
[11] F. Cao and J.P. Singh. Efficient event routing in content-based publish-subscribe service networks. In 23rd conference on computer communications (IEEE INFOCOM 2004), 2004.
[12] F. Cao and J.P. Singh. Medym: Match early with dynamic multicast for content-based publish-subscribe networks. In Proceedings of the ACM/IFIP/USENIX 6th international middleware conference (Middleware 2005), 2005.
[13] A. Carzaniga, D. Rosenblum, and A. Wolf. Design and evaluation of a wide-area notification service. In ACM transactions on computer systems 3, pages 332–383, August 2001.
[14] A. Carzaniga, M.J. Rutherford, and A.L. Wolf. A routing scheme for content-based networking. In Proceedings of IEEE INFOCOM 2004, 2004.
[15] M. Castro, P. Druschel, A. Kermarrec, and A. Rowstron. Scribe: a large-scale and decentralized application-level multicast infrastructure. IEEE journal on selected areas in communications, 20, October 2002.
[16] D.J. Christensen, D. Brandt, K. Stoy, and U. Pagh Schultz. A Unified Simulator for Self-Reconfigurable Robots. In Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’08), pages 870–876, Nice, France, September 22-26 2008.
[17] M. Cilia. An active functionality service for open distributed heterogeneous environments. PhD thesis, Department of computer sciences, Darmstadt University of Technology, 2002.
[18] G. Costa, G.P. Picco, and S. Rossetto. Publish-subscribe on sensor networks: a semi-probabilistic approach. In Proceedings of the 2nd IEEE international conference on mobile ad hoc and sensor systems (MASS 2005), 2005.
[19] P. Costa, M. Migliavacca, G. Picco, and G. Cugola. Introducing reliability in content-based publish-subscribe through epidemic algorithms. In Proceedings of the 2nd international workshop on distributed event-based systems (DEBS’03), 2003.
[20] G. Cugola and J.E.M. de Cote.
On introducing location awareness in publish-subscribe middleware. In Proceedings of the International Workshop on Distributed Event-Based Systems (IDCS/DEB’05), 2005.
[21] Y.K. Dalal and R.M. Metcalfe. Reverse path forwarding of broadcast packets. Communications of the ACM, pages 1040–1048, 1978.
[22] C. Diot, B. Levine, B. Lyles, H. Kassem, and D. Balensiefen. Deployment issues for the IP multicast service. IEEE network magazine, special issue on multicasting, 2000.
[23] P.T. Eugster. Type-based publish-subscribe. 2004.
[24] P.T. Eugster, R. Guerraoui, and C. Damm. On objects and events. In Proceedings of the conference on object oriented programming systems, languages and applications (OOPSLA), 2001.
[25] T. Fukuda and S. Nakagawa. Dynamically reconfigurable robotic systems. In Proceedings of IEEE international conference on robotics and automation, volume 3, pages 1581–1586, Philadelphia, PA, 1988.
[26] R.G. Gallager, P.A. Humblet, and P.M. Spira. Distributed algorithm for minimum-weight spanning trees. ACM Transactions on Programming Languages and Systems (TOPLAS), 5:66–77, 1983.
[27] R.F.M. Garcia, K. Støy, D.J. Christensen, and A. Lyder. A self-reconfigurable communication network for modular robots. In Proceedings of the First International Conference on Robot Communication and Coordination (ROBOCOMM’2007), Athens, Greece, October 2007.
[28] R.F.M. Garcia, A. Lyder, D.J. Christensen, and K. Stoy. Reusable electronics and adaptable communication as implemented in the odin modular robot. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA’2009), pages 1152–1158, Kobe, Japan, May 2009.
[29] A. Gupta, O. Sahin, D. Agrawal, and A.E. Addabi. Meghdoot: content-based publish/subscribe over p2p networks. In Proceedings of the ACM/IFIP/USENIX 5th international middleware conference (Middleware’04), 2004.
[30] C.P. Hall, A. Carzaniga, J. Rose, and A.L. Wolf. A content-based networking protocol for sensor networks.
Technical Report CU-CS-979-04, Department of computer science, University of Colorado, 2004.
[31] Y. Huang and H. Garcia-Molina. Publish/subscribe in a mobile environment. In Proceedings of the 2nd ACM International workshop on data engineering for wireless and mobile access (MobiDE), 2001.
[32] S. Kale, E. Hawen, F. Cao, and J.P. Singh. Analysis and algorithm for content-based event matching. In Proceedings of the fourth international workshop on distributed event-based systems (DEBS’05), 2005.
[33] G. Kremer, N. Melot, B. Pottier, and A. Plantec. Modeling mobility at process level for wireless sensor networks. 2010.
[34] J. Legatheaux Martins and S. Duarte. Routing algorithms for content-based publish/subscribe systems. IEEE communications surveys and tutorials, 12(1), First quarter 2010.
[35] A. Lyder, R.F.M. Garcia, and K. Støy. Genderless connection mechanism for modular robots introducing torque transmission between modules. In Workshop Modular Robots: State of the Art (ICRA 2010), pages 77–81, 2010.
[36] K. Malek, G. Panduragan, and V.S. Anil Kumar. Distributed algorithms for constructing approximate minimum spanning trees in wireless sensor networks. IEEE transactions on parallel and distributed systems, 20:124–138, 2008.
[37] R. Meier and V. Cahill. Steam: event-based middleware for wireless ad hoc networks. In Proceedings of the international workshop on distributed event-based systems (IDCS/DEB’02), 2002.
[38] J. Moy. OSPF Protocol Analysis. RFC 1245 (Informational), 1991.
[39] J. Moy. MOSPF: Analysis and experience. RFC 1585 (Informational), 1994.
[40] G. Mühl, L. Fiege, and F.C. Pietzuch. Distributed Event-based systems. Springer Verlag, Berlin, Germany, 2006.
[41] B. Oki, M. Pfluegel, A. Siegel, and D. Skeen. The information bus - an architecture for extensive distributed systems. In 1993 ACM symposium on operating systems principles, December 1993.
[42] G.P. Picco and G. Costa. Semi-probabilistic publish/subscribe.
In Proceedings of 25th IEEE International conference on distributed computing systems (ICDCS 2005), 2005.
[43] G.P. Picco, G. Cugola, and A.L. Murphy. Efficient content-based event dispatching in the presence of topological reconfiguration. In 23rd international conference in distributed computing systems (ICDCS 2003), pages 234–243, Providence, RI, USA, 19-22 May 2003.
[44] P.R. Pietzuch and J.M. Bacon. Hermes: A distributed event-based middleware architecture. In 22nd International Conference on Distributed Computing Systems Workshops (ICDCSW ’02), 2002.
[45] S. Ratnasamy, M. Handley, R. Karp, and S. Shenker. Application-level multicast using content-addressable networks. Lecture Notes in Computer Science 2233, 2001.
[46] A. Rowstron and P. Druschel. Pastry: Scalable, decentralised object location, and routing for large peer-to-peer systems. IFIP/ACM International conference on distributed systems platforms (Middleware 2001), pages 329–350, 2001.
[47] B. Segall, D. Arnold, J. Boot, M. Henderson, and T. Phelps. Content-based routing with elvin4. In Proceedings of AUUG2K, Canberra, Australia, June 2000.
[48] T. Sivaharan, G. Blair, and G. Coulson. Green: A configurable and re-configurable publish/subscribe middleware for pervasive computing. In Proceedings of DOA 2005, 2005.
[49] K. Støy, D. Brandt, and D.J. Christensen. Self-reconfigurable robots: an introduction. The MIT press, 2007.
[50] A.S. Tanenbaum. Computer networks. Upper Saddle River, N.J.: Prentice Hall, 3rd edition, 1996.
[51] T.G. Team. Achieving scalability and throughput in a publish/subscribe system (IBM research report RC23103). Technical report, IBM, 2004.
[52] W.W. Terpstra, S. Behnel, L. Fiege, A. Zeidler, and A.P. Buchmann. A peer-to-peer approach to content-based publish/subscribe systems. In Proceedings of the 2nd international workshop on distributed event-based systems (DEBS’03), 2003.
[53] Y. Wang, D. Qiu, G. Das Achlioptas, P. Larson, and H. Wang.
Subscription partitioning and routing in content-based publish/subscribe networks. In 16th international symposium on distributed computing (DISC’02), 2002.
[54] M. Yim, Z. Ying, K. Roufas, D. Duff, and C. Eldershaw. Connecting and disconnecting for chain self-reconfiguration with polybot. IEEE/ASME transactions on mechatronics, 7(4):442–451, 2002.
[55] E. Yoneki and J. Bacon. Content based routing with on-demand multicast. In Proceedings of the 24th IEEE international conference on distributed computing systems, workshop on wireless ad hoc networking (ICDCS - WWAN 2004), pages 788–793, 2004.
[56] S.Q. Zhuang, B.Y. Zhao, A.D. Joseph, R. Katz, and J. Kubiatowicz. Bayeux: An architecture for scalable and fault-tolerant wide area data dissemination. 11th Int. workshop on network and operating systems support for digital audio and video, 2001.