
MADEIRA

Design Document

Scalability Study Regarding Project Madeira.

Selection of Tools

MAD-WP7-0001-08-0

PROJECT CONFIDENTIAL

Ricardo Marin

Francoise Sailhan

Bertrand Baesjou

David Ortega

Markus Leitner

Date: 17/04/2020

UPC

Ericsson

Telefonica I+D

Telefonica I+D

Siemens

Pages: 41

MADEIRA PROJECT CONFIDENTIAL

Scalability Study Regarding Project Madeira. Selection of Tools

17/04/2020

MAD-WP7-0001-02-0

Table of Contents

Table of Contents
Index of Tables
Index of Figures
0 Document Information
  0.1 Document History
  0.2 Keywords
  0.3 Glossary and Abbreviations
  0.4 Purpose of the Document
  0.5 Project Internal References
  0.6 External References
  0.7 Relationship to Other Documents
  0.8 Open Issues
1 Executive Summary
2 Background
  2.1 Scalability in P2P Networks
    2.1.1 Scalability Studies in P2P Networks
    2.1.2 Performance and Scalability of JXTA
      2.1.2.1. Methodology
      2.1.2.2. Objectives
      2.1.2.3. Conclusions
  2.2 Scalability in Web Services
    2.2.1 Performance for mobile Web Services
      2.2.1.1. Methodology
      2.2.1.2. Objectives
      2.2.1.3. Conclusions
    2.2.2 Web Services security and load balancing in Grid environment
      2.2.2.1. Methodology
      2.2.2.2. Objectives
      2.2.2.3. Conclusions
    2.2.3 Streaming validation model for SOAP Digital Signatures
      2.2.3.1. Methodology
      2.2.3.2. Objectives
      2.2.3.3. Conclusions
    2.2.4 Hybrid server architecture for web applications
      2.2.4.1. Methodology
      2.2.4.2. Objectives
      2.2.4.3. Conclusions
    2.2.5 XML parsers
      2.2.5.1. Methodology
      2.2.5.2. Objectives
      2.2.5.3. Conclusions
    2.2.6 Quick XML parser for WS digital signatures
      2.2.6.1. Methodology
      2.2.6.2. Objectives
      2.2.6.3. Conclusions
3 Scalability Scenarios
  3.1 Madeira Platform
  3.2 Policy Based Management System
    3.2.1 Scenario
    3.2.2 Evaluation Criteria
    3.2.3 Expected Results
  3.3 Northbound Interface
    3.3.1 Scenario
    3.3.2 Evaluation Criteria
    3.3.3 Expected Results
  3.4 Configuration Management
    3.4.1 Evaluation Scenario
    3.4.2 Evaluation Criteria
    3.4.3 Expected Results
  3.5 Fault Management
    3.5.1 Scenario
      3.5.1.1. Expected Bottlenecks
    3.5.2 Evaluation Criteria
    3.5.3 Expected Results
4 Scalability Definition
  4.1 Communication
  4.2 Computation
5 Analysis Tools
  5.1 Simulation Tools
    5.1.1 Introduction
    5.1.2 Requirements and Concerns
      5.1.2.1. Implementation of the simulator and Madeira system
      5.1.2.2. Usability
      5.1.2.3. Networking model
    5.1.3 Simulator Selection
    5.1.4 JSIM Simulator
      5.1.4.1. Simulator Implementation
      5.1.4.2. Simulator usability
      5.1.4.3. Networking Model
    5.1.5 NS Simulator
      5.1.5.1. Simulator Implementation
      5.1.5.2. Simulator usability
      5.1.5.3. Network Modelling
    5.1.6 Conclusion
  5.2 Emulation Tools
    5.2.1 XEN
      5.2.1.1. Introduction
      5.2.1.2. Preliminary Results
      5.2.1.3. Conclusion
    5.2.2 Alarm Generators
  5.3 Large Test-Beds
    5.3.1 PlanetLab
  5.4 Conclusion
6 Appendix
  6.1 Screen Captures of Simulation Tools


Index of Tables

Table 1: Document History
Table 2: Glossary and Abbreviations
Table 3: Project Internal References
Table 4: External References
Table 5: Open Issues
Table 6. Characteristics of the JSIM and NS Simulator


Index of Figures

Figure 1 Scripting Editor
Figure 2 JSIM Graphic editor
Figure 3 NAM graphic Editor


0 Document Information


0.1 Document History

Issue  Date        Comments                      Editor
0.1    21/09/2006  Initial Issue                 R. Marin
0.2    02/10/2006  1st Draft for Reviewing       Ricardo Marin
1.0    05/10/2006  1st Version of the Document   Ricardo Marin
1.1    19/10/2006  Reviewed Version              Ricardo Marin
2.0    25/10/2006  Final Version                 Ricardo Marin

Table 1: Document History

0.2 Keywords

MADEIRA, Network Management, P2P, Scalability.

0.3 Glossary and Abbreviations

Term   Explanation
AC     Action Consumer
AMC    Adaptive Management Component
CE     Condition Evaluator
DMC    Decision-Making Component
MDM    Madeira Distributed Management
NBI    North-Bound Interface
PBMS   Policy-Based Management System
PM     Policy Manager
P2P    Peer-to-Peer

Table 2: Glossary and Abbreviations

0.4 Purpose of the Document

This document presents the scalability assessment planned for the Madeira project.

After presenting the state of the art, it defines the term ‘scalability’ in the context of the project, first per component and then from a more general point of view. Finally, it presents the tools that will be used during the scalability study.


0.5 Project Internal References


Short Code    Document Reference
[ARCH_REQ]    Madeira, Architecture & Interfaces, Requirement Specification; MAD-WP1-REQ-0001-00-C.
[SCEN]        Scenario Description for the Madeira Project, MAD-WP4:0001 Ver. D1.03
[MODEL_REQ]   Candidate Architecture for Adaptive Management Component, Requirement Specification, MAD-WP2-REQ-0001-02-I
[MAD-WP4]     Design document Scenario description for the Madeira Project, MAD-WP4-0001
[MODEL]       Madeira Modelling Approach, MAD-WP3-DD-0003-07-I

Table 3: Project Internal References

0.6 External References

Short Code    Document Reference
[BURG]        Burgess, M. and Canright, G.; Scalability of Peer Configuration Management in Partially Reliable and Ad Hoc Networks; in IFIP/IEEE Eighth International Symposium on Integrated Network Management, 2003, pp. 293-305; ISBN 1-4020-7418-2
[CYGWIN]      http://www.cygwin.com/
[JUAN]        Li, Juan; Vuong, Son; An Efficient Clustered Architecture for P2P Networks; in Proceedings of the 18th International Conference on Advanced Information Networking and Applications (AINA ’04)
[JXTA1]       M. Jan, Large scale (preliminary) experimental evaluation of JXTA on Grid’5000, PARIS Research Group, Grid’5000 Spring School, March 2006
[JXTA2]       E. Halepovic, R. Deters, The JXTA performance model and evaluation, Future Generation Computer Systems 21, Special Issue: P2P computing and interaction with grids, 2005, pp. 377-390
[JXTA3]       E. Halepovic, R. Deters, The costs of using JXTA, Proceedings of the P2P’02, Linköping, Sweden, 2002, pp. 41-48
[JXTA4]       JXTA Bench Project, http://bench.jxta.org
[JXTA5]       JXTA Project, http://www.jxta.org
[LO]          Virginia Lo et al.; Scalable Supernode Selection in Peer-to-Peer Overlay Networks; in Proceedings of the 2005 2nd International Workshop on Hot Topics in Peer-to-Peer Systems (HOTP2P’05)
[NAM]         http://www.isi.edu/nsnam/nam/
[NS]          NS home page: http://www.isi.edu/nsnam/ns/
[OTCL]        http://bmrc.berkeley.edu/research/cmt/cmtdoc/otcl/index.html
[PLANETLAB]   http://www.planet-lab.org
[SCA]         A. B. Bondi, Characteristics of scalability and their impact on performance, Proceedings of the 2nd international workshop on Software and performance, Ottawa, Ontario, Canada, 2000.
[WS1]         http://www.sics.se/~thierno/aswn2003.pdf
[WS2]         http://www.extreme.indiana.edu/xpola/wsslbgrids.pdf
[WS3]         http://www.cs.indiana.edu/~welu/c14n_hpdc05.pdf
[WS4]         http://www.bsc.es/publications/deepcomputing/dc3018Barcelona_eDragon_HPPC06.pdf
[WS5]         http://vtd-xml.sourceforge.net/benchmark.html
[XEN]         http://www.xensource.com
[XEN-Wiki]    http://wiki.xensource.com
[XEN-Guide]   https://bscw.celtic-madeira.org/bscw/bscw.cgi/d14595/XEN-guide.doc
[XERCES]      http://portal.acm.org/citation.cfm?id=1104760.1104942

Table 4: External References


0.7 Relationship to Other Documents


0.8 Open Issues

Number  Description
1       Scalability Test Planning
2

Table 5: Open Issues


1 Executive Summary

This document focuses on the scalability assessment of the Madeira project. First, it reviews related projects that have carried out scalability studies of their own. It then presents the tools and procedures that we are likely to use in our own assessment. Next, it defines the most important scalability issues for the project, starting on a per-component basis and ending with a general definition of scalability that focuses on communication and computation. Finally, it proposes several scenarios derived from the main focus of this Work Package.


2 Background

Within the Celtic project Madeira, we consider it important to study the scalability of the proposed solution. The main reason is that the solution was designed with the management of large Peer-to-Peer (P2P) networks in mind, so we need to show that it can scale to manage them.

To focus our study, we begin with a survey of relevant projects that have studied the scalability of similar systems. The information extracted from these projects can guide our own studies. In addition, comparing their results with ours may help to decide whether the scaling behaviour of Madeira is satisfactory.

2.1 Scalability in P2P Networks

2.1.1 Scalability Studies in P2P Networks

P2P networks are attractive because they offer features such as self-organization, availability and fault tolerance. However, managing this kind of network poses some challenges. In particular, it is important to see how management solutions scale as the number of nodes to be managed grows.

In [LO] the authors survey three different techniques for choosing super-peers. The problem they address is the selection of a large number of super-peers from a huge and dynamically changing network in which neither the node characteristics nor the network topology are known. They conclude that all the super-peer selection algorithms presented cope with the scalability issues, because selecting super-peers introduces a logical hierarchy that facilitates the management tasks.

In [JUAN], the authors present a P2P network architecture based on clusters and the election of super-peers, intended to make this kind of network more efficient and scalable. They compare their solution with P2P networks that are completely distributed, decentralized and unstructured, and conclude that their hierarchical super-peer structure is the more advisable one for achieving good performance and scalability. They base this statement on tests carried out in simulators and on a real test-bed composed of 16 nodes.

In [BURG] the authors present several models for configuration management in Peer-to-Peer networks, starting from the premise that such networks should not be centralised and are dynamic and heterogeneous. They analyse their proposal by assessing the scaling behaviour of the models with respect to policy distribution and policy enforcement. The study concludes that the more centralized a solution is, the less it scales, and that decentralizing even one of the two criteria makes the network scale better. However, decentralization brings new problems, such as trust or convergence to a stable regime.

From these scalability studies of Peer-to-Peer networks, we can conclude that our solution should scale well, since all of them indicate that one requirement is that the network is not completely decentralized. In Madeira, the Super-Peers establish a logical hierarchy for network management.

Unfortunately, these studies do not propose any procedure for assessing the scalability of a Peer-to-Peer network. In fact, we conclude that the scalability of a clustered Peer-to-Peer network has not yet been studied in practice; at least, we have not found any related paper.


2.1.2 Performance and Scalability of JXTA

This section gives a short summary of the performance and scalability analysis done for JXTA [JXTA5]. Unfortunately, its focus was quite different from that planned for Madeira, so we cannot learn very much from it. Nevertheless, the following sections highlight everything that may be adapted for Madeira or where similar ideas could be applied. Most of the following can be found in [JXTA2]. More recently, the PARIS research group has attempted a large-scale test of JXTA using Grid’5000; unfortunately, only a preliminary presentation is available [JXTA1], which does not include a detailed description.

2.1.2.1. Methodology

The JXTA community has initiated a dedicated subproject called “Bench” [JXTA4] to analyse the performance and scalability of JXTA. Unfortunately, most of the work done so far emphasises performance-related measurements (such as the performance of pipes) in order to detect progress or regression across JXTA releases. So far, only measurements on a real test bed have been made, while large-scale simulation and theoretical analysis seem to be missing. However, the authors tried to draw some general conclusions from their tests that also address scalability. The focus of Madeira is quite different: it has been designed to accomplish tasks within the specific field of network management, whereas JXTA is a general-purpose P2P middleware. Even so, some of the ideas used by the JXTA community might be interesting for the evaluation of the Madeira platform, which is a P2P middleware too.

2.1.2.2. Objectives

Before summarizing the JXTA performance model proposed in [JXTA3], we need to give a short summary of the architecture of a JXTA network.

JXTA is based on the idea of each peer providing specific services (e.g. files). These services are published by means of so-called advertisements. Each peer (“edge peer”) publishes its advertisements by sending a list to its rendezvous peer. Rendezvous peers are conceptually similar to super-peers: each one maintains an index of all advertisements of the peers it is responsible for (i.e. all edge peers in its peer group). Finally, relay peers act as intermediaries that provide connectivity across firewalls and NATs. As in Madeira, JXTA peers can communicate either statelessly via messages or in a connection-oriented way via JXTA Pipes, which can be seen as analogous to the P2PPort within Madeira.

Given this design, the overall performance and scalability of JXTA depend heavily on the rendezvous and relay peers, as they may become bottlenecks with increasing traffic.

Therefore the proposed JXTA performance model [JXTA3] consists of the following metrics:

- Latency of typical peer operations (e.g. start-up, join group)
- Pipe message round-trip time
- Pipe message and data throughput
- Rendezvous query response time, throughput and reliability
- Relay message throughput

Based on these metrics, several measurements were conducted in [JXTA2], and the results compare JXTA version 1.0 with version 2.0. For the first three metrics, the tests include simple experiments that analyse the throughput and overhead of various aspects of the JXTA implementation, as well as some reliability considerations such as data loss. Similar measurements can be made for the different communication facilities provided by the Madeira platform.
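As an illustration of how round-trip time and throughput measurements of this kind could be collected, the following is a minimal sketch in Python. It is not taken from [JXTA2] or from the Madeira platform: the function names, the returned keys and the `loopback_echo` stand-in transport are assumptions made for this example, and a real test would replace the stand-in with a call that sends a message over the communication facility under test and waits for the reply.

```python
import statistics
import time
from typing import Callable, Dict


def measure_round_trips(send_and_wait: Callable[[bytes], bytes],
                        payload: bytes,
                        repetitions: int = 100) -> Dict[str, float]:
    """Time repeated request/response exchanges and summarise them."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        send_and_wait(payload)  # blocks until the reply has arrived
        samples.append(time.perf_counter() - start)
    total = sum(samples)
    return {
        "mean_rtt_s": statistics.mean(samples),
        "median_rtt_s": statistics.median(samples),
        "max_rtt_s": max(samples),
        # Guard against a zero total on very coarse timers.
        "messages_per_s": repetitions / total if total > 0 else float("inf"),
        "bytes_per_s": (repetitions * len(payload) / total
                        if total > 0 else float("inf")),
    }


def loopback_echo(message: bytes) -> bytes:
    """Hypothetical stand-in transport: returns the message immediately."""
    return message


if __name__ == "__main__":
    print(measure_round_trips(loopback_echo, b"x" * 1024, repetitions=1000))
```

Reporting the median and maximum alongside the mean is a deliberate choice here, since a few slow exchanges can otherwise dominate the average.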


For rendezvous peers, the JXTA community mainly analysed the influence of different peer group sizes on the average response time to a query. For Madeira, the natural counterpart is to analyse how changes in the maximum number of nodes per cluster influence overall performance and scalability.
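To make the parameter under study concrete, the sketch below computes how many clusters (and therefore super-peers) result from a given maximum cluster size. It rests on simplifying assumptions that are ours and not part of the Madeira design: exactly one super-peer per cluster, and clusters filled up to the maximum size. The function names are hypothetical.

```python
from typing import Dict, Iterable, List


def cluster_layout(total_nodes: int, max_cluster_size: int) -> Dict[str, int]:
    """Number of clusters needed when every cluster is filled to the maximum."""
    if total_nodes < 0 or max_cluster_size < 1:
        raise ValueError("need total_nodes >= 0 and max_cluster_size >= 1")
    # Ceiling division using integers only.
    clusters = -(-total_nodes // max_cluster_size)
    return {
        "max_cluster_size": max_cluster_size,
        "clusters": clusters,
        "super_peers": clusters,  # assumption: one super-peer per cluster
    }


def sweep_cluster_sizes(total_nodes: int,
                        sizes: Iterable[int]) -> List[Dict[str, int]]:
    """Evaluate the layout for each candidate maximum cluster size."""
    return [cluster_layout(total_nodes, size) for size in sizes]
```

A sweep such as `sweep_cluster_sizes(1000, [10, 50, 100])` shows the trade-off to be measured: larger clusters mean fewer super-peers, but each super-peer then serves more members.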

2.1.2.3. Conclusions

The main conclusions drawn in [JXTA2] address the design of P2P systems based on JXTA. The paper provides an overview of which communication facilities perform best in specific situations, which might be of great interest to the designers of future P2P systems based on JXTA.

Regarding rendezvous query performance, which might be the most interesting point for Madeira, it has been observed that JXTA version 2.0 scales well with group size but has a remarkably high delay even for small peer groups. In general, the authors concluded that the JXTA project is making good progress in terms of performance and scalability, even though there are several metrics on which the old version performed better. They are also aware that their picture is incomplete, especially with regard to scalability.

2.2 Scalability in Web Services

2.2.1 Performance for mobile Web Services

The paper Performance Considerations for Mobile Web Services [WS1] focuses mainly on reducing the size of the messages sent by web services by compressing the XML data. Owing to the nature of XML, these messages are much larger than traditional transaction messages: the paper states that a typical web service needs to send four to five times as many bytes as the original content of the message. Note that this research used a normal web server accessed by mobile clients. In the Madeira project the situation is reversed, as the web service itself resides on a mobile client.
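The size overhead of XML and the effect of compression can be checked on any sample message with a few lines of Python. The sketch below is only an illustration: the envelope layout, the element names and the field values are hypothetical and are not taken from [WS1] or from the Madeira NBI, and the ratio obtained depends entirely on the message used.

```python
import xml.etree.ElementTree as ET
import zlib
from typing import Dict


def wrap_in_envelope(fields: Dict[str, str]) -> bytes:
    """Serialise the fields into a SOAP-like XML envelope (hypothetical layout)."""
    envelope = ET.Element("Envelope")
    body = ET.SubElement(envelope, "Body")
    record = ET.SubElement(body, "Notification")
    for name, value in fields.items():
        ET.SubElement(record, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def size_report(fields: Dict[str, str]) -> Dict[str, float]:
    """Compare raw content size, XML message size and compressed size."""
    content = "".join(str(v) for v in fields.values()).encode("utf-8")
    message = wrap_in_envelope(fields)
    compressed = zlib.compress(message)
    return {
        "content_bytes": len(content),
        "xml_bytes": len(message),
        "compressed_bytes": len(compressed),
        "xml_to_content_ratio": len(message) / max(len(content), 1),
    }
```

Calling `size_report` on representative messages gives a first impression of how much of each message is markup rather than content, and of how much of that markup compression can recover.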

2.2.1.1. Methodology

The study focuses on the trade-off between CPU time and the time necessary to send the XML data. However, it mainly considers web services running on static networks that are polled by mobile devices. The web service (with much processing power) compresses the data, and the mobile client only has to decompress it. The paper does consider that the client could also perform compression, but does not elaborate on this.

The study examines how much time it costs to compress the XML, in combination with the time saved by sending the smaller package over the connection. This sum is called the overall response time. However, the study does not look in depth at the CPU cycles needed for compression. It only notes that compression costs many more cycles and, as a remedy, states that the solution can dynamically drop compression when the processor becomes overloaded.
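The trade-off described above can be written down as a simple model: the overall response time is the time spent compressing and decompressing plus the time needed to transfer the (possibly smaller) message. The Python sketch below is our own simplification, not the model used in [WS1]; the parameter names, the CPU load threshold of 0.8 and the example figures are assumptions chosen for illustration.

```python
def response_time(size_bytes: float,
                  bandwidth_bytes_per_s: float,
                  compression_ratio: float = 1.0,
                  codec_time_s: float = 0.0) -> float:
    """Codec time plus transfer time of the (possibly compressed) message."""
    return codec_time_s + (size_bytes * compression_ratio) / bandwidth_bytes_per_s


def should_compress(size_bytes: float,
                    bandwidth_bytes_per_s: float,
                    compression_ratio: float,
                    codec_time_s: float,
                    cpu_load: float,
                    max_cpu_load: float = 0.8) -> bool:
    """Compress only if it shortens the response and the CPU is not overloaded."""
    if cpu_load >= max_cpu_load:
        return False  # dynamically drop compression under heavy load
    compressed = response_time(size_bytes, bandwidth_bytes_per_s,
                               compression_ratio, codec_time_s)
    plain = response_time(size_bytes, bandwidth_bytes_per_s)
    return compressed < plain
```

With hypothetical figures, a 50 000 byte message compressed to 20 % of its size in 0.05 s is worth compressing on a 5 000 byte/s link (2.05 s instead of 10 s) but not on a 10 000 000 byte/s link, where the codec time exceeds the entire uncompressed transfer time.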

2.2.1.2. Objectives

As stated above, the study focused on lightweight mobile clients that use a web service hosted on a (heavyweight) fixed server, with response times sped up by letting the server compress the transferred data.

2.2.1.3. Conclusions

The study shows that, when enough free CPU cycles are available, compressing by default has only positive effects. The time needed to make a request and get a response (the response time) does not increase: it is the same as without compression on fast links (Bluetooth and faster) and shorter on slow links (GPRS and slower). For Madeira, this suggests that compression is likely to save some time on really slow links, but is also likely to load our web service quite heavily.


2.2.2 Web Services security and load balancing in Grid environment

This study [WS2] addresses the slowness of message-level security in web services by applying the grid computing paradigm. The proposed solution is to separate the processing of message security from the processing of the actual message content.

2.2.2.1. Methodology

The scalability focus was on distributing the load of one web service over multiple hosts by splitting up the processing steps. The study looked at the response time of messages and at the number of web service requests that could be handled.

2.2.2.2. Objectives

An edge node works as a dispatcher that sends each web service request to a security node, which handles the entire web services security context and then forwards the actual message internally to a computer in the grid environment. The study also uses a duplex-communicating scheduling scheme (each node informs the dispatcher/security node of its current load) to distribute the load evenly across the web servers in the grid, and offers the optional use of SSL/TLS to secure communication between grid nodes. However, the paper states that processing the security context is in most cases just as CPU-intensive as the actual web service request, so as many security context nodes as web service nodes are required. In terms of response time and the number of requests that the web service can handle, the proposed solution performs well.
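The load-reporting scheme described above can be sketched as a dispatcher that always picks the node with the lowest reported load. The Python class below is a minimal illustration under our own assumptions: the class and method names are hypothetical, load is a plain number reported by each node, and the actual scheduling policy of [WS2] may differ.

```python
from typing import Dict, Iterable


class Dispatcher:
    """Sends each request to the node that last reported the lowest load."""

    def __init__(self, node_names: Iterable[str]) -> None:
        # Every node starts with zero reported load.
        self._load: Dict[str, float] = {name: 0.0 for name in node_names}
        if not self._load:
            raise ValueError("at least one node is required")

    def report_load(self, node: str, load: float) -> None:
        """Called by a node to inform the dispatcher of its current load."""
        if node not in self._load:
            raise KeyError(f"unknown node: {node}")
        self._load[node] = load

    def pick_node(self) -> str:
        """Return the least-loaded node; ties go to the first node registered."""
        return min(self._load, key=self._load.get)
```

For example, after `report_load("a", 0.9)`, `report_load("b", 0.2)` and `report_load("c", 0.5)`, a call to `pick_node()` returns `"b"`. The same pattern could serve a web service dispatcher placed in front of several NBI nodes.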

2.2.2.3. Conclusions

The study concludes that splitting up processing tasks and responsibilities is a good way to increase the scalability of a web service. It suggests, however, that the nodes should communicate internally using binary messages instead of XML to gain even more performance.

For Madeira, the solution presented in this paper might not be feasible, because the grid network used in the paper is more static and reliable than the Madeira P2P network. However, it might be feasible to enable more than one NBI in the Madeira network and to use an edge node as a web service dispatcher (WSD) that dispatches requests to these NBI nodes, thus applying the grid paradigm.

2.2.3 Streaming validation model for SOAP Digital Signatures

This study [WS3] focuses mainly on security aspects, which are covered by WP6. Therefore, we will not elaborate on it extensively. However, since it might be relevant, we summarize it briefly.

2.2.3.1. Methodology

This paper focuses on the performance of validating large, digitally signed SOAP messages. It states that, due to the design of XML, validating a message is a process that often leads to scalability and performance issues, both in the number of messages that can be processed and in the CPU cycles needed to validate a message.

2.2.3.2. Objectives

The study presents an improved validation design and implementation system (GHPX/SSSV) that uses a streaming process for validating SOAP messages.
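The general idea behind a streaming process can be sketched as follows. This is only an illustration of incremental digest computation, not the GHPX/SSSV design: it ignores XML canonicalization and signature verification, and the helper names and the sample message are hypothetical.

```python
import hashlib


def digest_whole(message: bytes) -> str:
    """Digest computed after the complete message has been buffered."""
    return hashlib.sha1(message).hexdigest()


def digest_streaming(chunks) -> str:
    """Digest updated chunk by chunk, so the message never has to be
    held in memory as a whole."""
    h = hashlib.sha1()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


def chunked(message: bytes, size: int):
    """Yield the message in pieces, as it might arrive from a socket."""
    for i in range(0, len(message), size):
        yield message[i:i + size]


def demo_streaming():
    message = b"<soap:Envelope>" + b"<item>x</item>" * 1000 + b"</soap:Envelope>"
    return digest_whole(message) == digest_streaming(chunked(message, 256))
```

Both approaches produce the same digest; the streaming variant only needs memory proportional to the chunk size.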

2.2.3.3. Conclusions

The study found a noticeable improvement in system performance, scalability and memory efficiency. Since Madeira will want to use signed messages in its security model, this study will be important when the scalability of web services security is researched.


2.2.4 Hybrid server architecture for web applications

This study [WS4] focuses on combining known web server designs into a web server that retains the positive aspects of both designs, resulting in a better performing web server overall and thus also better performing web services on top of this server.

2.2.4.1. Methodology

For web servers there are two major basic designs: (i) threaded, where every connection is assigned to a thread, and (ii) event driven, where each client request is assigned to a worker thread from a worker thread pool; the worker handles the request and then returns to the pool. The threaded variant holds the connection for its whole lifetime, whereas the event-driven variant has no real association between server and client.

The threaded variant is well suited for short-lived persistent associations between server and client, but when the time between requests becomes too large, this model ties up a lot of resources that are merely idle, for example threads blocked in I/O operations on sockets. However, terminating inactive threads on a heavily loaded server is commonly a bad idea, because many clients will not be able to finish their session and will have to connect again, causing a sort of domino effect. In combination with SSL/TLS this effect can be amplified, because TLS needs a lot of CPU cycles in the session establishment phase.

With the event-based variant, no thread is blocked on a socket while there is nothing to process, because every request to the server is handled by a worker thread, which handles the request and then returns to the worker thread pool to register itself as available again. Therefore no resources are kept occupied unnecessarily by idle processes. One thread is in charge of accepting new connections and registering them with the channel selector, where another thread, the so-called request dispatcher, waits for socket activity.

Since the channel selector keeps track of the connections, this model does not need to close connections unnecessarily, which is a great performance gain for HTTP/TLS connections.

A remarkable characteristic of this model is that there is no “natural” way of controlling the number of connected clients. Whereas in the multi-threaded model the number of threads is the limit, here some sort of admission control is needed to cope with too many connections.
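The event-based model described above can be sketched with a selector and a worker pool. This is a minimal illustration, not the server studied in [WS4]: it uses a local socket pair instead of a listening socket, and the function names and the upper-casing "request handler" are hypothetical.

```python
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor


def handle_request(conn: socket.socket) -> int:
    """Worker: read one request, reply, then return to the pool."""
    data = conn.recv(4096)
    conn.sendall(data.upper())
    return len(data)


def dispatch_once(selector, pool, timeout):
    """Request dispatcher: wait for socket activity and hand every
    readable socket to a worker thread. Idle connections cost no thread."""
    futures = [pool.submit(handle_request, key.fileobj)
               for key, _events in selector.select(timeout)]
    return [f.result() for f in futures]


def demo_event_driven():
    client, server_side = socket.socketpair()
    selector = selectors.DefaultSelector()
    # Stands in for the acceptor registering a new connection.
    selector.register(server_side, selectors.EVENT_READ)
    try:
        with ThreadPoolExecutor(max_workers=2) as pool:
            idle = dispatch_once(selector, pool, timeout=0.05)  # no activity yet
            client.sendall(b"getTopology")
            handled = dispatch_once(selector, pool, timeout=1.0)
            reply = client.recv(4096)
    finally:
        selector.close()
        client.close()
        server_side.close()
    return idle, handled, reply
```

While the connection is idle, no worker is occupied; a worker is only borrowed from the pool for the duration of one request. Admission control would have to be added separately, for example by limiting the number of registered connections.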

2.2.4.2. Objectives

In the study a hybrid server is proposed that takes the best characteristics of both server architectures. Every connection is assigned a thread, which handles connection management and the processing of requests. However, the acceptor thread and the request dispatcher from the event-based model are preserved; they handle incoming requests and signal the associated thread. In this way the thread itself does not need to perform blocking I/O operations on the socket.

2.2.4.3. Conclusions

The test results put forward in the study show that this hybrid approach is very promising: it provides large performance increases even for a single or a few connected clients, and it significantly outperforms the multi-threaded model when many clients are connected. Performance is measured in terms of reply rate, response time, session lifetime, number of timeout errors and session throughput. The solution stands out especially for session-based web servers with encryption.

This model could very well be applicable to the NBI within the Madeira project, since most of the communication between the NBI and the OSS is session based.


2.2.5 XML parsers

This “state of the art” part is not based on any paper or academic study, but on reading various websites and selecting one of the more recent ones [WS5]. The source is not scientifically rigorous, but it offers a useful discussion of the status of XML parsers.

2.2.5.1. Methodology

There are not many articles about XML parsers and their performance. Since parsers are one of the main components of web services, one would have expected more research in this area, especially on the differences between parsers, their architectures, their performance and what could be done to improve the overall quality of parsers.

2.2.5.2. Objectives

There are many (more than 15) widely used XML parsers, each with its own positive and negative properties. Even though no scientific data is available on the different parsers, there are recent benchmarks comparing them. Note that the referenced benchmark is hosted on a site that offers a particular (Open Source) parser, so its authors probably have an interest in showing figures that favour their parser. Even so, it shows that for certain types of XML parsing particular parsers deliver higher performance than others. In this particular benchmark the advertised parser is a factor of three to four faster than the parser currently used in Madeira, and it also uses less memory.

2.2.5.3. Conclusions

We have seen (outdated) XML parsing benchmarks on many other websites with similar results. These older benchmarks also show large differences between the different types of parsers. This strengthens the assumption that, from a performance point of view, it is necessary to check whether there is a modern parser on “the market” that performs optimally for the type of XML messages used in the web service. Even if one finds an optimal parser, it might still be worthwhile to research whether the parser can be further optimized and customized for one's own needs, especially since parsing is responsible for a large part of the CPU cycles consumed by web services. Therefore, if there is a real need to trim down the NBI, optimizing the parser technology could be a reasonable option.
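A parser comparison of this kind can be set up with very little code. The sketch below is only an illustration in Python, comparing two parsers from the Python standard library on a synthetic topology-like document; it does not involve Xerces or the parser advertised in [WS5], and the document layout and function names are assumptions made for the example.

```python
import time
import tracemalloc
import xml.dom.minidom
import xml.etree.ElementTree as ET


def make_document(n_nodes: int) -> str:
    """Build a synthetic XML message with n_nodes child elements."""
    items = "".join(
        f'<node id="{i}"><status>up</status></node>' for i in range(n_nodes)
    )
    return f"<topology>{items}</topology>"


def measure(parse, document: str):
    """Return (elapsed seconds, peak traced bytes) for one parse call.
    Memory tracing slows parsing down, so the times are only comparable
    with each other, not with untraced runs."""
    tracemalloc.start()
    start = time.perf_counter()
    parse(document)
    elapsed = time.perf_counter() - start
    _current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak


def benchmark(n_nodes: int = 2000):
    document = make_document(n_nodes)
    parsers = {
        "minidom (DOM tree)": xml.dom.minidom.parseString,
        "ElementTree": ET.fromstring,
    }
    return {name: measure(parse, document) for name, parse in parsers.items()}
```

Running `benchmark()` with message sizes that are typical for the web service in question gives a first impression of how much the choice of parser matters for that workload.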

2.2.6 Quick XML parser for WS digital signatures

Just like the study in section 2.2.3, this study [XERCES] is mostly related to the scalability of secure web services and is therefore more closely related to WP6. However, this paper points out once again that there is considerable room for optimization in the area of XML parsers.

2.2.6.1. Methodology

Again, this paper focuses on validating the digital signatures of XML messages. The authors introduce a custom-built parser, called QXP (Quick XML Parser), based on Xerces (the parser also used in the Madeira project to parse XML messages). This parser parses the received XML document directly into the required byte array, instead of the usual steps of first building a DOM tree and then canonicalizing and normalizing this DOM tree into the required byte array.

2.2.6.2. Objectives

The only metric used was CPU speedup; no reference to memory usage was made. The authors studied 4 KB messages for both the DSA-SHA1 and RSA-SHA1 algorithms, signed with X.509 certificates.

2.2.6.3. Conclusions

The study found a 7- to 21-fold speed increase compared with the Xerces parser. Since their engine is based on the Xerces engine, and Madeira also uses the Xerces parser, it is worthwhile to look into this when the web services security context needs a performance improvement.


3 Scalability Scenarios

Before carrying out the scalability study in the context of the Madeira project, we need to define what scalability means in our environment. To arrive at this definition, we follow a bottom-up approach in which we identify what we consider to be the most critical points, or bottlenecks, in every component. The following sections describe the bottlenecks of each main component of the system; they lead to section 4, where the general definition of scalability in the context of this project can be found.

3.1 Madeira Platform

The Madeira platform consists of the following set of services:

Notification Service – this is an event notification service based on the publish/subscribe paradigm.

Directory Service – this is a directory of Madeira enabled Nodes (MN) that allows MNs to be looked up.

Connectivity Service – this service provides reliable one-to-one and multi-hop connectivity between two MNs.

Persistency Service – this is a local persistence service per Madeira Node. It allows Application Modules to persist data to storage for retrieval across restarts.

Grouping Service – this service allows the dynamic formation of Madeira Node groups/clusters.

Lifecycle Service – this service provides start/stop/restart operations on all Application Modules loaded into the Madeira node.

Code Distribution Service – this service is responsible for supporting the dynamic loading of application logic/data or NE specific adapters into the Madeira Node.

Time Service – this service provides a network time synchronisation service, for use in alarm/fault correlation scenarios, and to timestamp data where necessary.

The Notification, Directory, Grouping and Code Distribution services are platform services required by the Configuration Management (CM) application. For this reason, the scalability of those services is studied in Section 3.4, which presents the scalability study of the CM. We do not plan to assess the scalability of the Lifecycle service, because this service is not critical from a scalability point of view.

From a Fault Management (FM) point of view, the scalability of all platform services involved in communication between peers should be investigated, as this can be considered a basis for further scalability investigations at application level. These are primarily the Notification, Directory and Grouping Services. The Connectivity Service, Code Distribution Service and Time Service might also be relevant, but additional simulations might not be required for this purpose. Finally, the Persistency Service has already been shown to cause severe performance limitations in its current implementation, and is currently not being used within the FM application. Apart from that implementation-level problem, this service is not considered critical with respect to scalability from a conceptual point of view.


Focusing on the services used by the Policy Based Management System (PBMS), we only foresee a bottleneck when using the Persistency Service. The delays introduced by this component when reading or writing values while evaluating the conditions for Notification Correlation and Alarm Formatting can become critical if the number of evaluations grows. The other services used by the PBMS, the Notification and Lifecycle Services, are not considered critical with regard to scalability.

3.2 Policy Based Management System

The Policy Based Management System is the component in charge of managing the policies that govern system performance, monitoring the conditions and enforcing the actions. Both conditions and actions are defined in the policies themselves.

3.2.1 Scenario

Bearing in mind the expected functionality defined in the documents and deliverables of the Madeira project, we foresee that the critical points could be:

Number of policies supported by the system. The system evaluates all active policies at the same time, so the more policies there are, the more evaluation processes the system will have active. This is not strictly true, since different policies can be based on the same conditions, in which case no additional evaluators are needed, but it holds in general terms.

Number of conditions being monitored. This is related to the previous bottleneck. The main difference is that the system may have only a few active policies, each of which evaluates many conditions. In this case, all the evaluated values need to be monitored.

Evaluation of conditions (accessing databases, asking for current values of parameters, etc.).
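The remark that policies sharing a condition do not need additional evaluators can be made concrete with a small sketch. The policy names and condition strings below are invented for illustration and do not come from the Madeira policy model.

```python
def evaluators_needed(policies: dict) -> set:
    """Return the distinct conditions that have to be monitored.
    A condition shared by several policies is evaluated only once."""
    conditions = set()
    for policy_conditions in policies.values():
        conditions.update(policy_conditions)
    return conditions


def demo_policies():
    policies = {
        "restart-on-overload": {"cpu_load > 0.9", "uptime > 60"},
        "raise-alarm-on-overload": {"cpu_load > 0.9"},
        "notify-on-link-loss": {"link_state == down"},
    }
    referenced = sum(len(c) for c in policies.values())
    return referenced, len(evaluators_needed(policies))
```

Here three policies reference four conditions in total, but only three evaluators are needed because `cpu_load > 0.9` is shared.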

Regarding a possible scenario, we think that the number of nodes has little bearing on the number of policies in the system. On the other hand, the number of nodes can influence the number of notifications or alarms received by the system. In that case, the link between the number of nodes and the scalability of our system is the number of alarms or notifications received in every node.

It is important to note the difference between nodes and clients. The scenario that we foresee is a wide area where we want to provide Internet connectivity. In this area, the number of nodes is going to be limited (hundreds of nodes). On the other hand, the number of clients can grow considerably.

To evaluate the scalability of the PBMS, we think that it is not necessary to use a large-scale test bed. Appropriate software can generate the alarms or notifications needed for the scalability study, so we can even run tests using a single node. Nevertheless, it will be interesting to run some tests with several nodes, to see whether we can assume that using one node is representative of the real network.

Therefore, we propose to carry out the scalability study on our own test-bed. We also propose to use emulation for the part of the study related to notifications or alarms: we can generate them using a dummy generator.
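A dummy generator of this kind could look roughly as follows. This is a sketch under our own assumptions: the alarm layout (an id, a severity and a number of padding fields) and the function names are hypothetical, and the handler stands in for whatever PBMS entry point is under test.

```python
import time


def dummy_alarms(count: int, n_fields: int = 4, field_size: int = 16):
    """Yield `count` synthetic alarms with a configurable number and
    size of fields (hypothetical layout, for load generation only)."""
    for seq in range(count):
        alarm = {"id": seq, "severity": "MAJOR" if seq % 2 else "MINOR"}
        for i in range(n_fields):
            alarm[f"field{i}"] = "x" * field_size
        yield alarm


def measure_handling(handler, count: int):
    """Feed `count` dummy alarms to `handler`; return (handled, seconds)."""
    start = time.perf_counter()
    handled = 0
    for alarm in dummy_alarms(count):
        handler(alarm)
        handled += 1
    return handled, time.perf_counter() - start


def demo_generator():
    seen = []
    handled, elapsed = measure_handling(seen.append, 1000)
    return handled, len(seen), elapsed
```

Repeating the measurement for increasing values of `count` shows how the handling time of the component under test grows with the number of alarms.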

It is also possible to use simulations to study the traffic generated when sending notifications, accessing the Persistency Service, or communicating with other nodes when new policies are introduced.


Finally, we think that using a large-scale test bed that allows testing the performance of the system with real nodes would be a good way to complete our study. Moreover, we propose to support the practical scalability studies with a theoretical study.

On the other hand, creating a theoretical model of the performance of the PBMS may require great effort, although such a model would also benefit the studies made using the simulator.

3.2.2 Evaluation Criteria

Using the previously defined scenario, in which both the number of policies and the number of alarms or notifications received by the PBMS matter, we propose to study the following metrics:

Response time when introducing a new policy.

Response time when evaluating a new notification / correlation.

Response time when enforcing an action.

We propose to study these metrics by taking time measurements with the methods provided by the platform and also by using a profiler. For each of these metrics we propose to take the following parameters as reference:

Number of policies being evaluated: We propose to increase the number of policies being processed and to study how the performance of the system varies. In this way, we can study the performance of both the Manager and the Condition Evaluators as the number of policies grows.

Number of different notifications being evaluated.

Number of nodes in the network: We propose to increase the number of nodes in the network step by step, bearing in mind that this also increases the number of neighbours. However, we do not expect this to be the most important parameter in the scalability study of the PBMS.

3.2.3 Expected Results

In general terms, we expect the PBMS to scale well: the response time when processing policies, the time required by the Condition Evaluators to process an alarm or a notification, and the enforcement of actions by the Action Consumer should not constrain the performance of the whole system, even for a large number of nodes, policies or alarms. In any case, we expect to find out what the main problems are when scaling the PBMS. There are some critical points, such as accessing the database, which can create problems when scaling the system. Although these are not strictly part of the PBMS, we will evaluate them, since without these interactions the PBMS would not be able to carry out its work.

3.3 Northbound Interface

The Northbound Interface (NBI) is the component that handles the communication between Madeira and external systems. It provides, to any external OSS, a set of operations, offered as Web Services, by which that external entity can:

Get configuration and fault management information flowing from Madeira to the OSS, both as notifications (CM events and FM alarms) and as responses to requests (e.g. GetTopology).

Execute commands within the Madeira Network such as: insert/get/remove policies, disable nodes, …


To be able to offer those services, the NBI receives FM alarms and CM events from the FM and CM modules of the underlying Madeira platform and transforms them into XML messages, which are subsequently sent to previously subscribed external OSSs. In addition, it collects external web services requests and commands, and performs the appropriate operations in Madeira to execute them.

3.3.1 Scenario

Analysing the NBI, we can foresee that the critical points could be:

Number of nodes managed by Madeira

When an OSS issues a topology request (getTopology or getPhysicalTopology method invocation) to receive information about the (physical or logical) topology of the Madeira network, the number of nodes and the complexity of the network might be critical. It is therefore important to study the resources that the NBI requires to handle a request to the web service operations it offers to the OSS, and the resources it needs to transform the information provided by Madeira into the XML format used in the web service response.

The number of nodes also influences the size of the data structures that the NBI has to transform from MADEIRA dataTypes into XML Web Services responses. Therefore we should look at how to minimize the time (in CPU cycles) and memory needed to create the response as a function of the size of the Madeira network. Components that require special attention are the XML parser and the code that creates the network topology object.

Finally, the resources that the CM requires to build and send the topology data to the NBI are also affected by the number of nodes it has to process.

Number of alarms/events generated by the applications (FM and CM)

The actual number of nodes does not seem to be very important, since a single node can produce many events and alarms. Therefore we should investigate how the NBI service performs when it receives many alarms and notifications from the underlying FM and CM applications.

Alarm size

This parameter influences the time and resources that the NBI needs to convert actual alarms or events (generated in the FM or CM application) into XML data before sending them as notifications to the external systems. Both the number of fields of the alarms and the size of those fields should be investigated.

Number of OSSs subscribed to receive notifications

In combination with the reception of alarms and notifications, we should also look at how the performance of the NBI node depends on the number of OSSs subscribed to receive notifications.

Number of Web Services requests at a given time

It is interesting to research how many resources a WS request consumes and how these requests scale with the number of connected OSSs.
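The effect of the alarm size parameter listed above can be explored with a small experiment. The sketch below converts a dictionary-shaped alarm into XML and reports the size of the result for different numbers and sizes of fields; the alarm layout, element names and function names are assumptions for illustration and do not reflect the actual Madeira alarm format.

```python
import xml.etree.ElementTree as ET


def alarm_to_xml(alarm: dict) -> bytes:
    """Serialise a flat alarm (field name -> value) as an XML document."""
    root = ET.Element("alarm")
    for name, value in alarm.items():
        child = ET.SubElement(root, name)
        child.text = str(value)
    return ET.tostring(root, encoding="utf-8")


def make_alarm(n_fields: int, field_size: int) -> dict:
    """Build a synthetic alarm with n_fields fields of field_size bytes."""
    return {f"field{i}": "x" * field_size for i in range(n_fields)}


def demo_sizes():
    """XML size in bytes, keyed by (number of fields, field size)."""
    return {
        (n, size): len(alarm_to_xml(make_alarm(n, size)))
        for n in (2, 8)
        for size in (8, 64)
    }
```

The XML size grows both with the number of fields (each field adds tag overhead) and with the size of the fields, which is why both should be varied independently in the study.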


3.3.2 Evaluation Criteria


To research the scalability aspects of the NBI, the research needs a well-defined context, and the expected goals and outcomes must be clear. Therefore, this section defines the performance metrics and research contexts that are needed to define the expected results.

The following parameters could influence the NBI performance:

Network size: measured in terms of the number of nodes and the cluster hierarchy of the Madeira network.

Number of FM alarms and CM events: Metrics which can be used to study the impact of these parameters are:

Number of OSSs subscribed to any alarm: We need to find out the impact on the system of the number of alarms we have to send via Pubscribe, especially with many generated alarms and many connected OSSs. We should look at overall NBI system performance (CPU cycles, memory usage and network activity).

Incoming alarm rate: Number of alarms arriving at the NBI per second or minute (from the CM or FM). This is closely related to the previous point, because these incoming messages trigger the NBI to act; we should therefore again look at the entire “chain” of the message, from incoming to outgoing: where are the possible bottlenecks, and what would a feasible solution be?

Alarm size: As stated in the previous section, metrics which could be used to study this parameter are:

Number of fields of the alarms.

Size of the fields of the alarms (expressed in bytes).

Web service requests: The number of WS requests the NBI can handle.

There might be several connected OSSs that all make requests to the web service. In this case we are not interested in how long it takes to create the actual response, but mainly in how long it takes to receive the request and process it to the point where the web service actually “understands” the request and is ready to call the appropriate modules to generate a reply or to send the request/command to the Madeira system. We would need to identify the probable “heavy” parts of the system in terms of the CPU cycles and memory usage of the different components.

We could also look at the overall response time from request to reply, thus including all components. However, this is a sort of crossbreed of all the previously listed research subjects.

Based on the described scenarios and identified parameters, the scalability study of the NBI could be carried out by means of the following methods:

The use of simulations to study the traffic generated towards the NBI (alarms and web service requests) and from the NBI (Pubscribe notifications and web service responses) while measuring the memory and CPU cycles needed to process such load. This study should show in what way the NBI is scalable, for example whether there is a linear or logarithmic increase in load when the number of requests increases linearly. It will also show which components and processes are the heaviest (most costly in CPU and memory). In addition, it could be useful to study the influence of the incoming alarms on the NBI when it runs alone. That is, alarms and events generated by the applications (FM and CM) travel through several Madeira modules: the application that produces the alarm, the notification service, the correlation functionalities and so on. This adds potential bottlenecks outside the NBI, which can cause confusion when evaluating scalability problems in the NBI. Therefore we should keep asking ourselves whether the problem is in the NBI or in other applications.

To study the bottlenecks that depend on the size of the network, it could be helpful to use large-scale test-beds. If those are not available, however, we could use emulation or perhaps implement dummy components that allow us to build topologies, generate FM alarms or CM events and provide them to the NBI (in our own test-bed).

We may find memory problems while processing large data types (e.g. topologies) in the NBI, so a profiling tool, for example JProfiler, could be used to detect computation bottlenecks in an attempt to improve the general performance of the NBI. The XML parser in particular should be tested, since the state of the art studies show that it might be a bottleneck or at least offer a chance to improve performance.

We could look at the parsing/handling of WS requests by the NBI by looking at the different components used, and within these components especially at the XML parser, by “firing” many WS requests at an NBI set up especially for this purpose. This should give us insights into how WS requests themselves scale within the NBI.

It could also be interesting to study how the NBI affects the rest of the system, that is, how MADEIRA is affected by having to serve requests from external systems. What happens, for example, to the Madeira system when the NBI is under heavy load?
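Whether the measured load grows linearly or more slowly with the number of requests can be judged from the measurements themselves. One simple option, sketched here as our own suggestion, is to fit the slope of log(cost) against log(load): a slope close to 1 indicates linear growth, while a slope well below 1 indicates sub-linear (for example logarithmic) growth. The data in the demo is synthetic and only serves to show the calculation.

```python
import math


def growth_exponent(loads, costs):
    """Least-squares slope of log(cost) against log(load)."""
    xs = [math.log(x) for x in loads]
    ys = [math.log(y) for y in costs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def demo_growth():
    loads = [100, 200, 400, 800, 1600]           # requests per interval
    linear = [0.5 * n for n in loads]            # synthetic linear cost
    logarithmic = [10 * math.log(n) for n in loads]  # synthetic log cost
    return growth_exponent(loads, linear), growth_exponent(loads, logarithmic)
```

With real measurements, the same function can be applied separately to CPU time and memory usage for each component, which also helps to identify the heavier components.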

The influence of the previously defined parameters on some important metrics should be analysed. Relevant metrics already identified are as follows (note there could be some cross-dependencies among them):

Response time as a function

Notification handling time

Maximum supported alarm rate

Maximum number of web services requests per time

Hardware resources (memory and CPU) usage

It must be highlighted that most of these metrics depend on the hardware platform Madeira is running on. It is not clear at this stage whether different platforms will be available for scalability testing but, at the least, some recommendations on minimum hardware requirements will be delivered.

In addition, some theoretical analysis of the resources the NBI needs to work properly could be done. From our point of view the most interesting studies could be:

To identify the amount of resources (in terms of memory and CPU cycles) the NBI needs to process topology data depending on the number of nodes of the managed network.

To identify the resources needed to handle the Pubscribe notifications depending on the alarm rate and size.


3.3.3 Expected Results


The proposed studies should give us insight into which parts of the system have which kind of performance impact, and should therefore help us to find and describe bottlenecks.

Following our findings, we might suggest or research possible solutions for removing or reducing the impact of the bottlenecks found, based on known paradigms as described in the state of the art part or by introducing new approaches.

According to the Madeira architecture, NBI functionality resides on the top-level cluster head, that is, the node at the top of the clustering hierarchy. Clearly this can be seen as a potential bottleneck, since all communications with external systems must pass through this node. Therefore, methods to alleviate this situation may be explored, for example distributing NBI functionality among several nodes.

Finally, a number of solutions addressing the identified bottlenecks should be provided and, depending on available time and resources, some of them implemented.

3.4 Configuration Management

The objective of the CM application is (i) to provide up-to-date topological information about the network, (ii) to allow applications to subscribe to topological events, and (iii) to allow applications to carry out some configuration operations. To manage topological information, the CM carries out basic configuration and management tasks such as:

Reading and monitoring data and events coming from lower level services,

Handling subscriptions to topological events,

Publishing topological information to higher level services.

To perform the above tasks, the CM relies mainly on four basic platform services: the OLSR service, the event notification service, the directory service and the grouping service. Note that those services are also intended to be used by any component of the Madeira system or application. The central role of those services motivates us to provide a detailed assessment of their scalability. In the following, we provide a brief description of each of these services in turn, along with a discussion of their ability to scale.

The proposed grouping service arranges the Madeira nodes into a set of clusters that are interconnected through cluster heads in a hierarchical manner. Note that the resulting grouping structure automatically reconfigures itself when confronted with network changes (e.g., cluster head or node failure). As the scale increases, this hierarchical, cluster-based grouping is less likely to experience performance bottlenecks than centralized approaches. This structure also tends to suffer less in the presence of failures, because the configuration and management functionalities are distributed among the cluster heads. Finally, it exhibits convenient characteristics (e.g., a weak but sufficient concentration of the traffic on cluster heads) that are required for aggregating or correlating event notifications and thereby indirectly reducing the traffic generated.

The directory service is likely to be scalable, since it keeps track of information related to the physical topology of its neighbouring nodes rather than of all the entities (e.g., Madeira nodes) deployed over the whole network. This reduces both the storage capacity required and the number of messages necessary to keep this information up to date.

The notification service is based on the grouping service and therefore addresses the scalability requirements indirectly, by distributing the effort required to compute event notifications and propagate them over the network. The performance of other applications such as the FM (e.g., the ability to analyse, collect and forward alarms in a timely way) also affects the scalability of the notification service.


Finally, the OLSR service implements the ad-hoc routing protocol OLSR (Optimised Link State Routing), which provides multi-hop message routing semantics. In other words, OLSR provides a network routing infrastructure composed of nodes (i.e., routers or hosts) that dynamically establish multi-hop routes among one another so as to form a mesh network in which out-of-range nodes can communicate with each other via intermediate nodes. Nodes that enter and leave the network are monitored implicitly by OLSR and, in turn, by the notification service. OLSR is based on a scalable diffusion algorithm that significantly reduces the number of transmissions required to diffuse a control message by defining a set of pre-selected nodes, called MPRs (multipoint relays), that retransmit diffused messages. The selection of MPRs is performed so as to (i) keep the number of MPRs to a minimum and (ii) reduce the number of retransmissions required to diffuse a message in the network. To conclude, we may argue that the OLSR protocol is suitable for large and dense networks¹, since it is based on an efficient diffusion algorithm (relying on MPRs) that reduces the traffic generated to control topology updates. As a result, when the density or number of nodes increases, the traffic generated is expected to increase slowly.
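The MPR idea can be illustrated with a minimal sketch. It is a simplified greedy heuristic in the spirit of the one described for OLSR, not the Madeira C++ implementation; the function name `select_mprs`, the data layout and the example topology are assumptions made for illustration, and refinements such as node willingness are left out.

```python
def select_mprs(neighbours, two_hop):
    """Greedy MPR selection for one node (illustrative only).

    neighbours: set of one-hop neighbour ids.
    two_hop: dict mapping a one-hop neighbour to the set of strict
             two-hop nodes reachable through it.
    Returns a subset of neighbours that covers every reachable two-hop node.
    """
    uncovered = set().union(*two_hop.values())
    mprs = set()

    # Step 1: a neighbour that is the only path to some two-hop node
    # has to be selected.
    for target in list(uncovered):
        providers = [n for n in neighbours if target in two_hop.get(n, set())]
        if len(providers) == 1:
            mprs.add(providers[0])
    for m in mprs:
        uncovered -= two_hop.get(m, set())

    # Step 2: repeatedly add the neighbour that covers the most
    # still-uncovered two-hop nodes.
    while uncovered:
        candidates = sorted(neighbours - mprs)
        if not candidates:
            break
        best = max(candidates,
                   key=lambda n: len(two_hop.get(n, set()) & uncovered))
        gain = two_hop.get(best, set()) & uncovered
        if not gain:
            break  # the remaining two-hop nodes are not reachable
        mprs.add(best)
        uncovered -= gain
    return mprs


# Hypothetical topology: three neighbours, three two-hop nodes.
neighbours = {"a", "b", "c"}
two_hop = {"a": {"x", "y"}, "b": {"y", "z"}, "c": {"z"}}
mprs = select_mprs(neighbours, two_hop)
print(sorted(mprs))
```

In this example only two of the three neighbours retransmit a diffused message, which is the source of the traffic reduction argued above.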

3.4.1 Evaluation Scenario

The scenario we are considering to assess the scalability of the CM is mostly a networking scenario, for two reasons:

• The OLSR, notification, directory and grouping services are essentially communicating services, because they constitute the cornerstone used to achieve network configuration and management in a distributed way in the Madeira system.

• The communication cost is the major parameter that could affect the scalability of the considered services; the computation load is negligible in comparison.

Consequently, the scalability assessment of those services will focus on the communication cost. As mentioned in the Scenario Description of the Madeira Platform [SCEN], the practical scenario defined by the Madeira actors is intended to demonstrate the Madeira functionalities on a small scale on real, available, and inexpensive network equipment; but this document also points out that it is essential to simulate the Madeira system over large-scale networks.

Note that it is unrealistic to envision a mesh network composed of more than a few tens of base stations, since such a network is characterised by low bandwidth and suffers from interference (in this case only low QoS guarantees could be offered to users). For this reason, the scalability of the CM platform should rather be evaluated over a wired core network potentially composed of hundreds of base stations and thousands of users.

The scenarios we are considering for assessing the scalability of the Madeira platform are (i) an increasing number of nodes belonging to the Madeira network, together with an increasing network surface², and (ii) varying network connectivity, expressed as the ratio of nodes that appear or disappear.

Considering the financial cost of deploying the Madeira system over a large-scale test bed, the best way to assess the scalability of the notification, directory and grouping services is to rely on a simulator. This choice is also motivated by the following facts:

• Simulators offer a flexible environment in which to define the circumstances of the Madeira system evaluation (different network scales, from small to large; various underlying topologies; various network types, e.g., wireless or wired networks).

• Contrary to emulation tools, simulators make it possible to model more realistically the characteristics of the links that interconnect the devices (e.g., bandwidth capacity, error rate) and the underlying protocols (MAC protocol, TCP, UDP).

¹ The network density is defined as the number of nodes per area.

² The network surface corresponds to the area on which the monitored nodes are deployed.

3.4.2 Evaluation Criteria

The main goal of the scalability assessment is to verify that:

• The OLSR, notification, directory and grouping services do not overuse the available bandwidth.

• No bottleneck forms on a single node or a subset of nodes. A bottleneck is identified by the fact that one node (or a subset of nodes) is overloaded or one link (or a subset of links) is overused.

• The topology information provided by the OLSR, notification, directory and grouping services does not require excessive storage capacity.

Consequently, the evaluated services will be studied under the following three metrics:

• The traffic generated by the evaluated services, e.g., the number of messages used to build a picture of the network topology (directory service) or the messages generated to group networked nodes (grouping service).

• The delay needed by the services to become aware of a topology update (e.g., a node failure), and the delay necessary to form a group and to update the group structure when confronted with topology changes.

• The amount of resources required for storing the topology information that is published and collected by the OLSR, notification, directory and grouping services.

Note that delay and traffic will be characterised by their maximum, minimum, and average values on each node. The results are expected to verify that the traffic (i.e., the number of messages propagated) and the delay remain fairly stable, i.e., grow polynomially rather than exponentially when the number of nodes increases or when the connectivity diminishes.
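One possible way to check the "polynomial rather than exponential" expectation on simulation output is to fit the slope of the measurements on a log-log scale: for polynomial growth the slope stays close to the degree of the polynomial, whereas for exponential growth it keeps rising as the network size increases. The sketch below assumes this approach; the function name and the sample series are illustrative and do not come from the Madeira simulations.

```python
import math


def loglog_slope(sizes, values):
    """Least-squares slope of log(value) against log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


sizes = [10, 20, 40, 80]

# Synthetic quadratic series: the slope is about 2 over any sub-range.
quadratic = [n ** 2 for n in sizes]
print(loglog_slope(sizes, quadratic))

# Synthetic exponential series: the slope grows with the network size.
exponential = [2 ** (n / 10) for n in sizes]
print(loglog_slope(sizes[:2], exponential[:2]),
      loglog_slope(sizes[2:], exponential[2:]))
```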

3.4.3 Expected Results

Considering the directory, grouping, notification and OLSR services, it appears that the main factor that may reduce scalability is the routing of event notifications. One plausible way to reduce the traffic generated by event notification dissemination is to route notifications intelligently. In short, event notifications are filtered before being propagated to the cluster heads of the upper or lower layer: an event notification is forwarded to a cluster head only if a client located in that direction is interested in receiving it.
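The filtering described above can be sketched as follows, assuming a tree-shaped hierarchy in which each cluster head remembers, per neighbouring head, the event types that some client in that direction has subscribed to. The class `ClusterHead`, its methods and the event names are hypothetical and only illustrate the principle; they are not the Madeira notification service API.

```python
class ClusterHead:
    """Cluster head that forwards an event only towards interested clients."""

    def __init__(self, name):
        self.name = name
        self.links = {}       # neighbouring head name -> ClusterHead
        self.interest = {}    # neighbouring head name -> wanted event types
        self.local = set()    # event types wanted by local clients
        self.delivered = []   # events handed to local clients
        self.forwarded = 0    # number of messages sent to neighbours

    def connect(self, other):
        self.links[other.name] = other
        other.links[self.name] = self

    def subscribe(self, event_type, origin=None):
        if origin is None:
            self.local.add(event_type)
        else:
            known = self.interest.setdefault(origin.name, set())
            if event_type in known:
                return
            known.add(event_type)
        # Advertise the interest in every other direction of the tree.
        for name, head in self.links.items():
            if origin is None or name != origin.name:
                head.subscribe(event_type, origin=self)

    def publish(self, event_type, payload, origin=None):
        if event_type in self.local:
            self.delivered.append((event_type, payload))
        for name, head in self.links.items():
            if origin is not None and name == origin.name:
                continue
            if event_type in self.interest.get(name, set()):
                self.forwarded += 1
                head.publish(event_type, payload, origin=self)


top, left, right = ClusterHead("top"), ClusterHead("left"), ClusterHead("right")
top.connect(left)
top.connect(right)
left.subscribe("link_down")
right.publish("link_down", "eth0")   # routed right -> top -> left
right.publish("cpu_high", "node7")   # nobody is interested: not forwarded
print(left.delivered, right.forwarded)
```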

3.5 Fault Management

The FM Application is responsible for collecting and formatting alarms from a variety of alarm sources, analysing and correlating alarms iteratively on each node, and forwarding them within the logical hierarchy of the Madeira overlay network (on the top-level cluster head the final alarm is passed to the NBI for northbound reporting).

3.5.1 Scenario

As the FM application is an integral part of the Madeira NMS, and as FM-related scenes involving real and apparent faults (e.g. outage, shutdown or moving of nodes) will be part of the scalability investigations of the overall system, it is essential to include an FM-specific view on scalability, which has two aspects:

• Scalability of the underlying Madeira services used by FM (platform services, e.g. the Notification Service, AMC services, PBMS), imposing a certain load model of notifications to be processed by those components.

• Scalability of the FM application design itself, based on certain assumptions on FM scenes in large networks.

In accordance with our general approach to investigating scalability, we will determine the behaviour (mainly by measuring performance-related characteristics) as a function of various parameters of the system (mainly the system size and the alarm load, where the latter itself depends on a variety of other parameters) that are described briefly in chapter 3.5.2. As a result of such investigations we expect to be able to predict precisely the overall performance for all possibly relevant network configurations and parameter values or, the other way around, to specify the optimal configuration (and parameter tuning) for a certain scenario.

In order to gain this knowledge it is necessary to analyse scalability using different methods. In principle the following methods could be relevant:

• Theoretical study
  o Analyse the impact of different network topologies and various modelling assumptions (fault models, alarming models, grouping algorithm, etc.) on the number of alarms and the total traffic generated
  o Use the results to identify relevant and interesting scenarios for simulation
  o Compare the results with those obtained by other methods (especially simulation)

• Simulation
  o On top of platform-level simulations (Ericsson)
  o Depending on parameters and fault models, focusing on the investigation of communication-related issues

• Emulation
  o Might be useful for small-scale tests only
  o Preliminary tests with a XEN setup have been partly successful (compare chapter 5.2.1)
  o Due to performance problems we currently do not plan to use XEN

• Live test bed (testing on “real” nodes)
  o 1 or a few nodes (Siemens test bed): specific tests regarding processing (alarm load etc.) as mentioned above
  o Up to around 20 nodes: useful for alarming load tests, supplementing the results of simulations (note that this might only be possible during project workshops)
  o Test bed interconnection between various sites (this might be dropped)

• Large test bed (Planetlab etc., including virtualization)
  o Useful for verifying simulation results on a larger scale (100 – 500 nodes)
  o The significance of (performance-related) results is an open question
  o Note that it has been decided to drop this activity (refer to chapter 5)

Among the methods above, it seems rational to start with a theoretical analysis aimed at determining the expected dependencies between different types of faults and the generated network traffic and processing load, for different network topologies. Its results can then be used to guide simulations. In parallel with these main tasks, additional effort might be spent on investigating the other methods as needed, in particular on further analysing processing issues by means of a small test bed.

3.5.1.1. Expected Bottlenecks

One definite bottleneck that has already been identified in previous tests is the limited alarm rate that can be processed on a single node. With respect to the reasons for this, we might distinguish between three different levels:

a) Implementation-related performance problems regarding alarm correlation. These have been identified and several short-term corrections have been implemented:
  • Database backend: (de-)serializing alarm objects
  • Logging within PBMS (to Swing GUI)
  • Further small improvements regarding AuditTrail, NCS-CE and NCS-AC (merging)

b) More conceptual issues regarding PBMS and how it is used by the FM application. These issues include:
  • Prioritised Condition Evaluation
  • Execution strategies in case of complex sets of policies
  After further discussions between those responsible for FM and PBMS, it has been decided that currently no improvements are going to be implemented, as they are not considered to be worth the effort.

c) Remaining performance limitations on Madeira nodes: even if the implementation has been optimised and the main bottlenecks have been eliminated, we should expect that the maximum alarm rate that can be processed continuously will not be considerably more than 1 alarm per second, assuming modest HW resources for a typical Madeira node.

According to c), one might argue that the FM application will always face a scalability issue if each network fault leads to an alarm reported to the operator, and hence processed by the top-level cluster head (unless this top-level cluster head is a very powerful machine, which would again come closer to a static centralised solution). A possible answer is that additional alarm consolidation could be performed dynamically on intermediate cluster levels (e.g. alarms on various communication link outages are compressed into reports on the quality of communication links in a certain network area once a threshold has been reached).
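A minimal sketch of such threshold-based consolidation on an intermediate cluster head is given below. It assumes that alarms arrive as (area, alarm type) pairs within one observation window; the function name, the message layout and the threshold value are hypothetical and do not reflect the FM application's actual data model.

```python
from collections import defaultdict


def consolidate(alarms, threshold):
    """Compress alarms of the same type and area into one summary report.

    alarms: list of (area, alarm_type) tuples seen in one time window.
    threshold: minimum number of similar alarms that triggers a summary.
    Returns the list of messages to forward to the next cluster level.
    """
    groups = defaultdict(int)
    for area, kind in alarms:
        groups[(area, kind)] += 1

    out = []
    for (area, kind), count in sorted(groups.items()):
        if count >= threshold:
            # One consolidated report replaces all individual alarms.
            out.append({"area": area, "type": kind + "_summary", "count": count})
        else:
            # Below the threshold, alarms are forwarded individually.
            out.extend({"area": area, "type": kind, "count": 1}
                       for _ in range(count))
    return out


window = [("north", "link_down")] * 5 + [("south", "link_down")]
forwarded = consolidate(window, threshold=3)
print(len(window), "alarms in,", len(forwarded), "messages out")
```

In this example six incoming alarms result in two messages towards the upper level, which is the kind of reduction needed to stay below the alarm rate a top-level cluster head can sustain.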

Further possible bottlenecks might be due to:

• Physical network topology (especially if the topology is close to a situation where all nodes are one-hop neighbours of all other nodes)

• Too many members per cluster (which increases the load on the cluster head)

• Many correlation rules (which might be required for a complex heterogeneous system)

3.5.2 Evaluation Criteria

We plan to evaluate the scalability of the FM application as a function of two different types of parameters. First, one can define so-called primary parameters, which are external inputs derived from a concrete scenario, i.e. they do not depend on Madeira-internal configuration or tuning. These primary parameters are:

• Network size: number of nodes

• Physical network topology: average number of next-hop neighbours

• Average failure rate (for various failure types of NEs or network links)

Similarly, we define secondary parameters that depend on the aforementioned primary ones and/or on Madeira-internal configuration or tuning:

• Alarm rate on a Madeira node (depending on cluster level)

• Maximum cluster size (grouping algorithm)

• Number of cluster levels

• Number of correlation rules (PBMS conditions) (might depend on cluster level)

• Fault & alarming models and assumptions (e.g. how many alarms per fault)

• FM-specific timings (correlation, analysis)

To evaluate scalability issues of the FM application, the following metrics (performance observables) need to be measured during tests and simulations with different values of the primary as well as the secondary parameters:

• Overall alarm delay (from detection of the fault to reporting to the operator)

• Alarm throughput (might depend on cluster level)

• Traffic generated by the FM application (per network link)

• Memory/storage/processing characteristics (per node)

Note that the dependency of local metrics (delay, throughput, resource consumption) on some of the secondary parameters (e.g. alarm rate, number of correlation rules) can very well be studied locally on a single node. For other investigations, different approaches are needed.
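To make the notion of measuring with "different values of primary as well as secondary parameters" concrete, the following sketch enumerates the resulting experiment configurations as a simple grid. The parameter names mirror the lists above, but the concrete values and the function name `experiment_grid` are placeholders chosen for illustration only.

```python
from itertools import product

# Placeholder values; the real ranges are to be defined by the study.
PRIMARY = {
    "nodes": [50, 200],
    "avg_neighbours": [3, 8],
    "failure_rate_per_hour": [0.1],
}
SECONDARY = {
    "max_cluster_size": [5, 10],
    "correlation_rules": [10, 100],
}


def experiment_grid(primary, secondary):
    """Yield one configuration dict per combination of parameter values."""
    params = {**primary, **secondary}
    names = list(params)
    for combo in product(*(params[name] for name in names)):
        yield dict(zip(names, combo))


configs = list(experiment_grid(PRIMARY, SECONDARY))
print(len(configs), "configurations, first:", configs[0])
```

Each configuration would then be run on a single node, in a simulator or on a test bed, depending on which of the metrics above it is meant to observe.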

3.5.3 Expected Results

Considering the traffic generated, we expect the FM application to generate only modest traffic for each fault. As the number of faults per unit time will typically rise with increasing network size, we additionally expect the traffic generated by multiple faults to be distributed throughout the network, i.e. not to overuse single links or network elements. In terms of network topologies, more traffic is clearly generated in dense networks (i.e. where each node has many next-hop neighbours). However, we expect the FM application to scale well even for those networks, as a large fraction of this traffic is likely to be reduced already on the lowest levels of the logical hierarchy through alarm correlation.

The detailed quantitative dependencies between faults, different network topologies and the traffic generated or other performance-relevant characteristics will be worked out in a theoretical study. Nevertheless, a few additional expectations can already be stated:

• In general, the performance characteristics should not depend on the number of nodes, since the resources for network management increase accordingly when the network is enlarged. There are a few exceptions to this rule:
  o Delay depends on the number of cluster levels (which in turn depends on the network size).
  o The overall alarm throughput is limited by the resources of the top-level cluster head if no additional consolidation is performed on intermediate levels.

• The performance characteristics should not depend on the average fault rate in the network, up to a certain threshold rate. It should then be possible to increase this threshold rate either by changing system parameters (e.g. new correlation rules for compressing alarm information) or by upgrading the HW of certain nodes.

• The dependency on other (secondary) parameters, such as the maximum cluster size, is expected to exhibit a local maximum or minimum that can then be used to optimise the system configuration.
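The two exceptions listed above can be illustrated with a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every cluster is filled up to the maximum cluster size and that every fault produces a fixed number of alarms that reach the top-level cluster head unconsolidated; both function names and the example figures are hypothetical.

```python
def cluster_levels(nodes, max_cluster_size):
    """Number of hierarchy levels when every cluster is full (idealised)."""
    if max_cluster_size < 2:
        raise ValueError("max_cluster_size must be at least 2")
    levels = 0
    heads = nodes
    while heads > 1:
        heads = -(-heads // max_cluster_size)  # ceiling division
        levels += 1
    return levels


def top_level_alarm_rate(nodes, faults_per_node_per_hour, alarms_per_fault):
    """Alarms per second at the top-level head without consolidation."""
    return nodes * faults_per_node_per_hour * alarms_per_fault / 3600.0


# The number of levels grows only logarithmically with the network size ...
print(cluster_levels(100, 10), cluster_levels(1000, 10))

# ... but the unconsolidated alarm rate at the top grows linearly.
print(top_level_alarm_rate(3600, 1, 1), "alarms per second")
```

Under these assumptions, a network of 3600 nodes with one fault per node and hour already reaches 1 alarm per second at the top-level cluster head, which is the order of magnitude quoted in Section 3.5.1.1 for the sustainable alarm rate of a single node.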

4 Scalability Definition

In telecommunication and software engineering, scalability is a desirable property of a system or a network, which indicates its ability either to handle growing amounts of work in a graceful manner or to be readily enlarged [1]. If we consider the scalability of the Madeira system from a communication point of view, we could define it as the capability of the Madeira system to sustain communication throughput when confronted with an expanding number of NEs to manage. From a computational point of view, scalability may refer to the ability of the system to support an increased load on resources when NEs are added.

However, considering the number of components and services that form the Madeira system, scalability as a property of the overall Madeira system is difficult to define precisely. For this reason, in Sections 4.1 and 4.2 we define the meaning of scalability by considering separately the services that compose the Madeira system, along two dimensions: communication and computation.

In Section 3 we presented the scalability scenarios and requirements per component. As can be seen, the definition of scalability depends on the scalability of the individual components. Moreover, it depends on the expected performance of each component in the context of the whole Madeira Management System. From the material in the previous section, we conclude that scalability, from the Celtic-Madeira point of view, can be divided into two broad groups, which are presented in the following sub-sections.

4.1 Communication

We define two intuitive measures of goodness to determine the capacity of the Madeira system to scale from a communication point of view. They are:

• Traffic overhead, which may be defined as the number of messages generated and received by each node. The traffic generated and received by each member should be kept to a minimum so as to guarantee that the Madeira system can cope with an increasing number of managed NEs.

• System stress, which is intended to capture the ability of the system to share the traffic load among NEs and thus to avoid the formation of bottlenecks. In practice, the system stress is described by the maximum, minimum, average and standard deviation of the traffic generated and received by each node and each link. These values will be obtained from the simulations.
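As an illustration of the system stress measure, the sketch below computes the four statistics from per-node message counts. The function name and the sample counts are made up for the example; in the study the input would come from simulation traces.

```python
import statistics


def system_stress(messages_per_node):
    """Summarise how evenly the traffic load is shared among nodes."""
    values = list(messages_per_node.values())
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.pstdev(values),  # population standard deviation
    }


# Hypothetical counts: node n4 carries far more traffic than the others.
counts = {"n1": 10, "n2": 10, "n3": 10, "n4": 50}
stress = system_stress(counts)
print(stress)
```

A large gap between the maximum and the mean, or a large standard deviation, points to a node (here `n4`) that concentrates the traffic and may become a bottleneck.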

If we consider more specifically the functionalities necessary to configure the mesh network, the aforementioned measures of goodness are used to determine whether or not:

• The configuration of the mesh network is scalable. This consists in showing that configuring the mesh network to allow multi-hop communication and to offer connectivity to out-of-range nodes is likely to scale. It also consists in assessing that the creation and maintenance of the O&M overlay, on top of which management tasks are performed and messages are routed, is achieved in a scalable way.

• The monitoring of the network is scalable. This consists in showing that the reading and monitoring of data and events coming from lower-level services, and the publishing of events (relating, e.g., to topological information) to higher-level services, are achieved in a scalable way.

Various options exist to assess the scalability of the Madeira system from a communication point of view. They include simulation-based and emulation-based studies as well as theoretical analysis. Among these options, simulation appears to be a convenient approach, since it provides a fine-grained representation of the data exchanges produced by the Madeira system.

In terms of NBI scalability, the following criteria are identified in the Madeira context:

• Number of alarms received from the FM and notifications received from the CM: the NBI has to serve all subscribed OSSs their notifications of these events.

• Number of OSSs subscribed to the NBI notification service: how well the notification service scales when many OSSs are subscribed and have to be served.

• Number of Web Service requests to the NBI: how many resources the WS requests consume, and thus how many OSSs can be associated with the NBI or at which rate they can issue requests.

• Alarm size, both as the number of fields in the message and as the overall size of the alarm message in bytes.

These criteria can be measured with dummy set-ups, simulations, test-beds and profiling tools, and would result in an overview of the identified bottlenecks. Based on this overview suggestions could be made to increase the performance and thus scalability of the NBI.
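How these criteria combine can be shown with a simple worked calculation. The sketch assumes the worst case in which every subscribed OSS receives every event; the function names and the figures are illustrative assumptions, not measured values.

```python
def nbi_outgoing_rate(events_per_second, subscribed_oss):
    """Notifications per second the NBI has to send (worst case)."""
    return events_per_second * subscribed_oss


def nbi_outgoing_bytes(events_per_second, subscribed_oss, alarm_size_bytes):
    """Outgoing notification volume in bytes per second (worst case)."""
    return nbi_outgoing_rate(events_per_second, subscribed_oss) * alarm_size_bytes


# Example: 2 events/s from FM and CM, 5 subscribed OSSs, 400-byte alarms.
print(nbi_outgoing_rate(2, 5), "notifications per second")
print(nbi_outgoing_bytes(2, 5, 400), "bytes per second")
```

The outgoing load therefore grows with the product of the event rate and the number of subscribed OSSs, which is why both appear as separate criteria.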

4.2 Computation

As can be derived from the scalability definition per component, the most critical performance issues might be due to computational resources. For example, for the PBMS the number of policies that can be processed and evaluated is more important than the amount of information that can be transferred to other components (communication issues).

Taking this into account, it is important to include computational resources in the scalability study of the Madeira project. Scalability in terms of computation can be defined as the ability of the system to perform well, in terms of node processing capabilities, as the system grows. In other words, the study of scalability in terms of computation will focus on the performance of the Madeira Management System and its components when the number of nodes to be managed grows.

As the system grows, the number of alarms and notifications that need to be managed will grow too. This growth will affect the computational resources related to their processing: policy evaluation and enforcement, alarm correlation, alarm publication, etc. In order to study the performance of the system in this situation, we have agreed to use alarm generators to emulate the existence of a large network without actually deploying one. Also related to the number of alarms, the NBI has to serve all subscribed OSSs their notifications of these events.
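An alarm generator of this kind could look like the following sketch. It assumes that each emulated node raises alarms independently with exponentially distributed inter-arrival times (a Poisson process), which is a modelling assumption made here for illustration and not a decision documented by the project; the function name and parameters are likewise hypothetical.

```python
import random


def generate_alarms(nodes, rate_per_node, duration, seed=0):
    """Return a time-ordered list of (timestamp, node_id) alarm events.

    nodes: number of emulated nodes.
    rate_per_node: mean alarms per second raised by each node (> 0).
    duration: length of the emulated period in seconds.
    seed: makes the generated load reproducible between test runs.
    """
    rng = random.Random(seed)
    alarms = []
    for node in range(nodes):
        t = rng.expovariate(rate_per_node)
        while t < duration:
            alarms.append((t, node))
            t += rng.expovariate(rate_per_node)
    alarms.sort()
    return alarms


# Emulate 100 nodes for 1000 s with 0.01 alarms/s each (about 1 alarm/s overall).
alarms = generate_alarms(nodes=100, rate_per_node=0.01, duration=1000.0, seed=1)
print(len(alarms), "alarms generated")
```

The resulting event list can be replayed against a single real node, so that the node under test sees the alarm load of a large network.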

Another important issue affected by the growth in the number of nodes is the inventory of the network. The more nodes the network has, the more time the NBI needs to process the topology when an OSS requests it via the Web Service. It is also important to note the possible bottleneck at the cluster head when reporting alarms.

Finally, there is another scenario that is not related to the number of nodes in the system but to the number of policies: the number of policies that need to be processed can increase without any increase in the number of nodes.

5 Analysis Tools

In the previous sections of this document, we discussed scalability in the context of the Celtic-Madeira project, first from a component point of view and then in a more general way. We also proposed how to study scalability in each case. Depending on the component, it is advisable to make use of simulators, emulators or large-scale test beds.

The following sections describe all the tools that have been proposed, regardless of whether they will actually be used, grouped into simulators (Section 5.1), emulators (Section 5.2) and large-scale test beds (Section 5.3).

5.1 Simulation Tools

5.1.1 Introduction

We are using simulation as a method for assessing the scalability of the Madeira system. A variety of simulators are available nowadays. In order to select the appropriate one, the simulators have to be compared and investigated with regard to several concerns relating to the design and functioning of the Madeira platform. One example of such a concern is the ability of a given simulator to support the scenarios envisioned by the Madeira project [MAD-WP4]. These scenarios require that the selected simulation tool supports the simulation of wireless networks. Another, non-technical concern relates to the licence of the simulator (open source versus proprietary). Based on these concerns (see Section 5.1.2 for a detailed list), we investigate the characteristics of the JSIM and NS simulators (Section 5.1.3). We conclude this investigation with a detailed description of the JSIM simulator (Section 5.1.4) and the NS simulator (Section 5.1.5).

5.1.2 Requirements and Concerns

When selecting a simulator to assess the scalability of the Madeira platform, one has to pay attention to three main categories of concerns:

A. The implementation of the simulator and the Madeira system.

B. The simulator usability.

C. The networking model promoted by the simulator.

In the following sections, we discuss each of them in turn.

5.1.2.1. Implementation of the simulator and Madeira system

If we consider the implementation of the Madeira platform, two crucial constraints have to be taken into consideration:

• The operating systems (Linux, Windows) under which the Madeira system has been developed and deployed, and on top of which the simulation should run.

• The language in which the Madeira system is developed, and therefore the language in which the simulation should be implemented.

These constraints affect the selection of the simulator, since the simulator should be deployable on the above-mentioned operating systems and provide support for the languages in which the Madeira system has been implemented.

Particular attention should be paid to the requirements that stem from the languages used to implement the Madeira platform. The Madeira system is mostly developed in the Java language (http://java.sun.com). However, the part of the Madeira system that implements the OLSR protocol has been developed in C++. This design choice was driven by the high performance level (e.g., low processing delay) required by such a routing protocol.

This multiplicity of development languages (C++ and Java) requires that either the simulator provides specific support for both languages or different simulators are used separately to assess the Java-based and C++-based Madeira components. We selected the second option: the scalability assessment of the OLSR protocol is conducted with a C++-based simulator, whereas the assessment of the rest of the Madeira platform is carried out with a Java-based simulator.

When implementing on top of a simulator, we must consider two important aspects that affect the complexity of this task:

• The degree of completeness and maturity of the implementation.

• The clarity and understandability of the API (Application Programming Interface) provided to implement applications over the simulator.

These aspects are crucial because, although several general-purpose network simulation packages have been released in the public domain, a great majority of them are still under development and do not provide the features necessary to simulate a complex application or protocol. Consequently, some of the proposed simulators provide unclear and undocumented APIs that would require additional development effort to be understood.

5.1.2.2. Usability

The usability refers to the ability of the simulator to provide functionalities that can ease the carrying out of the simulation. Examples of such functionalities include:

1. A script language and its corresponding environment for handling the creation of the simulation scenario and the launch of the simulation.

2. Some facilities integrated in the simulator to easily produce trace files, statistics, and graphs.

3. Detailed documentation that includes tutorials.

5.1.2.3. Networking model

The degree of abstraction supported by the simulator when describing the network topology and networked elements coupled with the level of granularity allowed to depict the messages, protocol stack and communication facilities constitute the two main components that define the accuracy of the simulator.

Note that unrealistic simulations are typically based on simplifying assumptions concerning the network topology and the end hosts. Examples of such assumptions are that:

1. The cost of computation at a site is a linear function of the load.

2. Links are characterised by a constant latency.

A consequence of these assumptions is that the network model does not capture the effect of congestion. In order to capture, in a fine-grained way, how network components interact with one another at the packet level, one has to employ detailed network models that characterise packet activity as in the real system.
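The difference between the two kinds of model can be illustrated with a toy example. The sketch contrasts a constant-latency link with a link that serves one packet at a time in FIFO order; it is a deliberately minimal queueing model written for this illustration and is not how JSIM or NS model links.

```python
def constant_latency_delays(arrivals, latency):
    """Simplified model: every packet experiences the same delay."""
    return [latency for _ in arrivals]


def fifo_link_delays(arrivals, service_time, latency):
    """Packet-level model: packets queue while the link is busy.

    arrivals: sorted packet arrival times in seconds.
    service_time: time needed to transmit one packet on the link.
    latency: propagation delay added after transmission.
    """
    delays = []
    link_free_at = 0.0
    for t in arrivals:
        start = max(t, link_free_at)      # wait if the link is still busy
        link_free_at = start + service_time
        delays.append(link_free_at - t + latency)
    return delays


# Packets arrive every 0.1 s but each needs 0.5 s on the link: overload.
arrivals = [0.0, 0.1, 0.2, 0.3]
simple = constant_latency_delays(arrivals, latency=0.01)
queued = fifo_link_delays(arrivals, service_time=0.5, latency=0.01)
print(simple)
print([round(d, 2) for d in queued])
```

With the constant-latency model the delay stays flat regardless of the load, whereas in the FIFO model each packet waits longer than the previous one, which is the congestion effect that a detailed network model is expected to reproduce.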

5.1.3 Simulator Selection

We selected both the J-SIM and NS simulators because they exhibited a number of desirable characteristics that will be described in the following section.

We summarise all these characteristics in Table 6 and provide a detailed description of the JSIM and NS simulators in Sections 5.1.4 and 5.1.5. Briefly, the main factors that led us to this choice are the following:

1. The implementation of both simulators results from a significant and continuous development and research effort. JSIM is the result of constant cooperation, since 1999, between several institutions and organisations including Cisco Systems Inc., Ohio State University, and the University of Illinois at Urbana-Champaign. NS, for its part, was introduced in 1989 and results from a development effort carried out by the VINT project [VINT] (LBL, Xerox PARC, UCB, and USC/ISI) in collaboration with ACIRI, the UCB Daedalus and CMU Monarch projects, and Sun Microsystems. In addition, NS and JSIM are both open-source software.

2. Each simulator offers a full, stable and well-documented implementation.

3. Each simulator includes a number of functionalities (graphical interface, scripting language, graph generator) that ease the running of a simulation.

4. The network model provides a fine-grained representation of networking-related information, e.g., the network topology, including link parameters (capacity, type of link, error rate), as well as a complete representation and implementation of the protocol layers (e.g., MAC layer, application layer), coupled with available implementations of protocols such as UDP or TCP.

5.1.4 JSIM Simulator

5.1.4.1. Simulator Implementation

The JSIM simulator is an open-source simulator developed in the Java language on top of a component-based software architecture called ACA (Autonomous Component Architecture). Thanks to these design choices (an autonomous component architecture coupled with the use of the Java language), the J-SIM implementation possesses several desirable features. JSIM is a platform-neutral, extensible, and reusable environment. Consequently, JSIM may be deployed on operating systems such as Windows or Linux. Another advantage of this design is that different components can be developed independently (on different platforms and/or in different programming languages) and integrated later in an easy way.


| Concerns | Sub concerns | NS | JSIM |
|---|---|---|---|
| Simulator Implementation | Operating system supported | Unix-based operating systems (FreeBSD, Linux, SunOS, Solaris) and Windows by relying on Cygwin [cygwin]. | Platform-independent simulator that provides support for e.g., Unix-based operating systems and Windows. |
| Usability | Development language | C++. | Java. |
| | Scripting environment | TCL. | Java TCL script and graphical scripting editor. |
| | Facilities for generating graphs | Tracegraph [NSTrace]. | Graph generator integrated in the implementation. |
| | Documentation (tutorial, API, support) | Detailed documentation including tutorial, API, architecture description, and a mailing list. | Detailed documentation including tutorial, javadoc, architecture description, and mailing list. |
| Licence | Licence type (open source versus proprietary) | Open source | Open source |

Table 6. Characteristics of the JSIM and NS Simulators

5.1.4.2. Simulator usability

JSIM includes graphical and simulation utilities to support visual simulations: a scripting interpreter coupled with a graphical interface for describing and running a simulation, and a graph generator coupled with a graph display interface for generating and showing the resulting graphs. Each is briefly described below and illustrated with figures.

JSIM provides a TCL-based scripting language for describing the simulation scenario. For this purpose, JSIM uses the JACL [JACL] implementation. As illustrated in Figure 1, JSIM also provides a graphical interface that helps users run and test their simulations. In addition, JSIM provides a convenient graphical environment for displaying the statistics resulting from a simulation. In the script that defines the simulation scenario, users may specify which parameters (e.g., which queue) should have their performance plotted in a graph. The monitored parameters are then displayed graphically, as shown in Figure 2.

Finally, an extensive documentation effort has been made to increase the usability of the JSIM simulator. A non-exhaustive list of the available documentation follows:


A JSIM tutorial (http://www.j-sim.org/tutorial/jsim_tutorial.html)

A description of the JSIM network architecture (http://www.jsim.org/whitepapers/ns.html)

Examples for writing ACA components (http://www.j-sim.org/guide/cwg.html)

An example for writing JSIM components (http://www.j-sim.org/drcl.inet/ex_echoer.html)

A description and user guide for using the JSIM scripts (http://www.jsim.org/geditor/v0.5/)

A description of the main packages composing the JSIM architecture (http://www.jsim.org/drcl.inet/index.html). More detailed descriptions of specific packages (e.g., the sensor package, the wireless package, and the differentiated services framework commonly called DIFFSERV) are also available.

A summary of the basic TCL commands that are frequently used to create a network scenario in the JSIM simulator. Note that extensive documents (tutorial, API, user guidelines) are available from the JACL website (http://tcljava.sourceforge.net/docs/website/ common) and the TCL web site (www.tcl.tk).

A user mailing list (http://lists.cs.uiuc.edu/pipermail/j-sim-users/)

5.1.4.3. Networking Model

The networking model of JSIM includes a generic node structure (either an end host or a router) and generic network components, both of which can be used as base classes to implement protocols across various layers. More precisely, the network model is based on a generalized packet-switched network model. This model is composed of basic, abstract components extracted from the Internet, hence the name Internetworking Simulation Platform (INET). Although the model is derived from the Internet, it is general enough to accommodate wireless networks.

5.1.5 NS Simulator

5.1.5.1. Simulator Implementation

NS is a discrete-event simulator that is particularly popular and commonly used to simulate protocols. It is an open-source, object-oriented simulator written in C++ and based on OTCL [OTCL], an object-oriented extension of the TCL interpreter. This interpreter executes the user's command scripts, while the C++ implementation provides a rich library of network and protocol objects.

5.1.5.2. Simulator usability

In practice, to set up a simulated network, one writes an OTCL script that initiates an event scheduler, sets up the network topology using the network objects and the setup functions provided by the NS library, and tells traffic sources when to start and stop transmitting packets. One then runs the simulator on this OTCL script. NS carries out the simulation and produces one or more text-based output files that contain detailed simulation data. These data can be visualized with a graphical simulation display tool called NAM (Network AniMator) [NAM], which is developed as part of the VINT project. NAM is a TCL/TK-based animation tool that provides a user-friendly graphical interface for visualizing network simulation traces, viewing the network topology, and animating the simulation at packet level.

In addition, NAM provides various data inspection tools. For instance, NAM can graphically present information such as the throughput and the number of packet drops at each link.
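The workflow described above (create an event scheduler, set up a topology, tell a traffic source when to start and stop, run, and collect a text trace) can be illustrated with a small, self-contained sketch. It is written in Python rather than OTCL and does not use the NS library: the `Scheduler` class, the `simulate` function, the node names and the trace line format are all invented here for illustration.

```python
import heapq
from itertools import count


class Scheduler:
    """Minimal discrete-event scheduler: events run in timestamp order."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = count()  # tie-breaker: equal timestamps run in FIFO order

    def at(self, time, action):
        heapq.heappush(self._queue, (time, next(self._seq), action))

    def run(self):
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action()


def simulate(start=1.0, stop=2.0, interval=0.25, link_delay=0.1):
    """One sender (n0), one receiver (n1), one link; returns trace lines."""
    sched = Scheduler()
    trace = []
    state = {"sending": False, "seq": 0}

    def receive(seq):
        trace.append(f"r {sched.now:.2f} n0 n1 pkt{seq}")

    def send():
        if not state["sending"]:
            return  # the source has been stopped
        seq = state["seq"]
        state["seq"] += 1
        trace.append(f"+ {sched.now:.2f} n0 n1 pkt{seq}")
        sched.at(sched.now + link_delay, lambda: receive(seq))
        sched.at(sched.now + interval, send)  # schedule the next packet

    def start_source():
        state["sending"] = True
        send()

    def stop_source():
        state["sending"] = False

    # Tell the traffic source when to start and stop, then run.
    sched.at(start, start_source)
    sched.at(stop, stop_source)
    sched.run()
    return trace
```

Calling `simulate()` returns a list of text lines, one per send or receive event, which plays the role of the text-based output file that a visualization tool would then read.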

In addition, NS provides detailed documentation concerning:

NS Installation guide: http://www.isi.edu/nsnam/ns/ns-build.html

NS tutorial: http://www.isi.edu/nsnam/ns/tutorial/index.html

5th VINT/NS simulator tutorial workshop: http://www.isi.edu/nsnam/ns/nstutorial/ucb-tutorial.html


NS user guide for beginners: http://wwwsop.inria.fr/maestro/personnel/Eitan.Altman/COURS-NS/n3.pdf

NS documentation: http://www.isi.edu/nsnam/ns/ns-documentation.html

NS manual page: http://www.isi.edu/nsnam/ns/ns-man.html

NS class hierarchy: http://www-sop.inria.fr/rodeo/personnel/Antoine.Clerget/ns

NS mailing list: http://www.isi.edu/nsnam/ns/ns-lists.html

Tcl/Tk quick reference guide: http://www.slac.stanford.edu/~raines/tkref.html

OTCL tutorial (Berkeley Version): http://bmrc.berkeley.edu/research/cmt/cmtdoc/otcl

OTCL tutorial (MIT Version): ftp://ftp.tns.lcs.mit.edu/pub/otcl/README.html

FAQ and user manual on NAM: http://www.isi.edu/nsnam/nam/

5.1.5.3. Network Modelling

NS was originally designed for simulating wired networks. However, it has been extended to support wireless networks, mobile ad hoc networks and even sensor networks. The NS simulator supports many network protocols, multicasting, and MAC protocols over wireless and wired networks. It offers modelling of nodes, links, and protocols, as well as a detailed model of the radio frequency characteristics (e.g., interference, fading) and of protocol interactions (burst errors on links, dropped packets).

5.1.6 Conclusion

Simulation is a convenient way of assessing the scalability of the Madeira platform from a communication point of view, as it provides a fine-grained representation of the data exchanges produced by the platform. A number of simulators are available and could be used for the Madeira platform. Among them, we select the JSIM and NS simulators for their desirable characteristics. In brief, each simulator is the result of a substantial research and development effort that has led to a full, stable and well-documented open-source implementation. Each simulator offers a number of facilities (graphical interface, scripting language, graph generator) that help in carrying out simulations. Two simulators were selected because the Madeira platform is developed in two languages (C++ and Java): NS is intended for assessing the OLSR protocol, and JSIM for assessing the rest of the Madeira platform.

5.2 Emulation Tools

5.2.1 XEN

5.2.1.1. Introduction

Xen [XEN] is a virtual machine monitor that can securely execute multiple virtual machines, each running its own operating system, on a single physical system, with performance reported to be close to native [XEN-Wiki]. It is open source and released under the terms of the GNU General Public Licence.

XEN appears to be evolving into a widely used tool, with a growing community that exchanges information and provides help in case of problems. Its progress is also confirmed by its planned integration into popular Linux distributions such as Suse 10.1.

For Madeira, it could be used to test scenarios consisting of many more nodes than we have tested so far, since it reduces the number of physical hosts needed.


5.2.1.2. Preliminary Results


The results of our first experiments with Xen 3.0 under Ubuntu 5.10 [XEN-Guide] are mixed. On the one hand, it was quite easy to set everything up and run several instances of Madeira on a single physical system; on the other hand, we faced severe performance problems as well as some non-reproducible problems.

From a performance point of view, we observed unacceptable response times within the event handlers of the Directory and Grouping Service, leading to an unstable configuration of the logical overlay. In addition, OLSR sometimes did not start correctly within an OS on a virtual host, and Madeira instances on physical hosts sometimes did not find their next-hop neighbours running on a virtual host, and vice versa. However, neither problem occurred every time, and we do not yet have a final explanation for this behaviour. One possible explanation, which may be valid at least for the performance-related issues, is that the overhead of process switches becomes too large because of the large number of threads used by each Madeira instance.

5.2.1.3. Conclusion

Unless we are able to solve the problems described above, XEN is not useful for emulating a larger test bed. We therefore plan to investigate the behaviour of XEN under Suse 10.1, which comes with XEN installed out of the box, so that possible configuration faults we are not aware of can be ruled out. It might also be interesting to analyse the behaviour on more powerful physical machines than those currently available.

5.2.2 Alarm Generators

In most of the (real-node) tests that investigate processing issues, we can confine ourselves to very small test beds (1-2 nodes), provided we use tools that generate the required load on our system.

For FM-related investigations, a simple alarm generator will be used that can generate a defined sequence of alarms locally. It interacts with the running Madeira software through a telnet interface; the corresponding alarm generator code is part of the FM application. The parameters that can be specified are:

Number of alarms (can also be specified as infinite)

Interval between alarms (in milliseconds)

Whether alarms should be cleared (and, if so, the corresponding sequence).
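As an illustration of how these three parameters determine the generated load, the following Python sketch computes the sequence of alarm events for a given configuration. It is not the Madeira alarm generator and does not model the telnet interface: the names `AlarmEvent` and `alarm_schedule` are hypothetical, and the clearing sequence used here (each alarm is cleared when the next one is raised) is only one possible choice, assumed for the example.

```python
from dataclasses import dataclass
from itertools import count
from typing import Iterator, Optional


@dataclass(frozen=True)
class AlarmEvent:
    time_ms: int   # offset from the start of the run, in milliseconds
    alarm_id: int  # sequence number of the alarm
    action: str    # "raise" or "clear"


def alarm_schedule(num_alarms: Optional[int],
                   interval_ms: int,
                   clear: bool = False) -> Iterator[AlarmEvent]:
    """Yield alarm events in time order.

    num_alarms=None means an infinite sequence. When clear is True, each
    alarm is cleared at the instant the next one is raised (an assumed
    clearing sequence, chosen purely for illustration).
    """
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    ids = count() if num_alarms is None else range(num_alarms)
    for i in ids:
        t = i * interval_ms
        if clear and i > 0:
            yield AlarmEvent(t, i - 1, "clear")
        yield AlarmEvent(t, i, "raise")
    if clear and num_alarms:
        # Clear the last alarm one interval after it was raised.
        yield AlarmEvent(num_alarms * interval_ms, num_alarms - 1, "clear")
```

Because the function is a generator, the infinite case costs nothing until events are consumed; a driver would read events one at a time, wait until `time_ms`, and issue the corresponding command to the system under test.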

5.3 Large Test-Beds

5.3.1 PlanetLab

PlanetLab is a collection of academic, industrial, and government institutions co-operating to support and enhance the PlanetLab overlay network. The PlanetLab Consortium is managed by:

Princeton University

University of California at Berkeley

University of Washington

Currently, Princeton University is hosting the consortium.

PlanetLab consists of computational resources hosted by organizations that donate time, rack-space, and network connectivity for the good of the community. To give an idea of its size, PlanetLab is composed of approximately:

690 machines

334 sites

25 countries

PlanetLab allows experimental services to run continuously and supports network measurement experiments. However, users have to adhere to widely accepted standards of network etiquette in order to minimize complaints from network administrators.

When a user joins PlanetLab, the user should provide two nodes to the network. These nodes should have IP connectivity, including a single static IP address and a DNS name. They must be placed outside the local firewall and allow the PlanetLab operations team to administer them. Apart from the hardware, the user should also make a financial contribution to the Consortium, unless the user is an academic institution.

On joining the PlanetLab network, every user is entitled to a number of slices (accounts) that depends on the type of user. To access the devices, the user should use existing security mechanisms such as ssh. Note that access to a node is non-root, in order to prevent hacking attempts on the PlanetLab nodes.

Once a partner is using the resources provided by the Consortium, it should be aware that PlanetLab provides absolutely no privacy guarantees with regard to packets sent to or from slices. Users should assume that packets will be monitored and logged to allow other users to investigate abuse. This lack of confidentiality should therefore be considered before installing any software whose confidentiality is a concern. Another important point for Celtic-Madeira is that use for research and educational purposes is allowed, whereas PlanetLab must not be used for any illegal or commercial activities.

Some other issues that can affect the Madeira software are:

It is not allowed to use your PlanetLab slice (account) to gain access to any hosting site resources that you did not already have.

It is not allowed to use one or more PlanetLab nodes to flood a site with so much traffic as to interfere with its normal operation. Use congestion controlled flows for large transfers.

It is not allowed to do systematic or random port or address block scans. Do not spoof or sniff traffic.

Summarising, PlanetLab is an overlay test bed designed to allow researchers to experiment with network applications and services that benefit from distribution across a wide geographic area. All uses of PlanetLab should be consistent with this high-level goal.

PlanetLab is designed to support both short-running experiments and continuously running services. No users other than the PlanetLab operations team have root access to PlanetLab nodes. This is an important drawback, as the Madeira software needs to be executed as root. Moreover, it is also important that PlanetLab allows us to run OLSR on its network.

5.4 Conclusion

After studying the candidate solutions for the scalability study, and bearing in mind the scenarios proposed in the previous sections, we have concluded that we will use simulation tools, together with test tools that generate alarms and notifications for the 1-node test beds.

From the scenario descriptions, it follows that we are not going to use a large-scale test bed, as we can emulate it with software that generates traps, alarms and notifications. When we need to evaluate the performance of a single node, we do not need to establish a large test bed to see how the node behaves as the number of notifications and alarms grows. This is possible because the software under evaluation does not communicate with any other node while these parameters are being evaluated. We also need to study the overall performance of the Madeira system, however. To address that, we need to set up a test bed with several nodes, although it need not be a large-scale one.

Regarding emulators such as XEN, our tests led us to conclude that they alter the performance of a node when more than one node is run on a single physical machine. We have therefore decided not to use them, as the results obtained would not be accurate enough.

In conclusion, for the scalability study we are going to use the simulation tools to study the communication-related part. We will also use tools that emulate particular scenarios, such as a node receiving a high number of notifications.


Appendix

6.1 Screen Captures of Simulation Tools


Figure 1 Scripting Editor

Figure 1 displays a screen capture of the scripting editor provided by JSIM. The left part of the figure shows the command line editor. Any user wishing to simulate an application can use it to execute the commands needed to describe the simulation scenario and launch the simulation. A user who does not want to use the command editor for this can instead select a file (see the right part of the figure) that includes all the necessary information.

Figure 2 JSIM Graphic editor

Figure 2 displays a screen shot of the graphical editor provided by JSIM.


When launching a simulation, a user can use the graphic editors to display some performance parameters (e.g., the number of packets exchanged by the simulated nodes). The two windows in the foreground of Figure 2 are such graphic editors.

The windows in the background correspond to the scripting editor that has been used to launch the simulation.

Figure 3 NAM graphic Editor

Figure 3 provides a screen capture of the NAM editor. This editor provides information about a simulation that is currently running. The simulation scenario considered involves two nodes (a sender and a receiver) that are exchanging messages.

Note that the user has the possibility to stop, accelerate, and replay some part of the simulation by just clicking on the buttons located on the top of the window. In addition, by using the file menu, any user can select a simulation scenario and start the simulation.
