Dependability Assessment of an OGSA Compliant Middleware Implementation by Fault Injection



Nik Looker and Jie Xu

Department of Computer Science,

University of Durham, DH1 3LE, UK

{n.e.looker, jie.xu}@durham.ac.uk

Abstract

This paper presents our research on applying our dependability assessment method to an OGSA compliant middleware product. Our initial proof of concept experiment was implemented using a stateless Tomcat web server and Apache SOAP. This research adapts and enhances our existing fault injection software (OGSA-FIT) from the stateless environment of a standard web service to the stateful environment of an OGSA Toolkit (Globus).

We compare our initial proof of concept experiment to our new target system based on the Globus Toolkit. The Globus Toolkit is implemented around an Apache Tomcat server using the Axis SOAP library as well as OGSA interfaces and libraries.

We address issues arising from latencies introduced into the system by OGSA-FIT. We present a model for calculating this latency and introduce new mechanisms into the software to reduce it. We also present results from our initial experiments, which showed a problem with an alpha version of the Globus Toolkit.

We detail future research, including plans to enhance our user GUI to provide semi-automatic test campaign generation. Since our OGSA-FIT software is intended to support both OGSA based middleware and standard SOAP based web-service environments, we also outline our research into providing interchangeable personality modules.

Keywords: GRID computing, GRID Middleware, OGSA, SOAP, software fault injection, Globus, Fault Model

1 Introduction

GRID computing, as a means of solving large scale scientific problems, requires dependable middleware to facilitate interaction between the various nodes in a GRID system and to reduce the level of complexity presented to the application programmer when designing and implementing a program to solve a problem. OGSA (Open Grid Service Architecture) [1] is such a system and is currently the front-running technology used in GRID computing [2-4]. As such it requires detailed metrics on its dependability. These are required not only to uncover existing problems with the system but also to provide potential users with metrics to compare its dependability to that of other systems.

Fault injection is a well-proven method of assessing the dependability of a system [5] and our initial proof of concept experiments, based around a SOAP based web service system, have already yielded significant results [6].

Our method uses a modified version of Network Level Fault Injection to inject faults into the network protocol stack before signing/encryption has taken place. Signing/encryption protocols are intended to stop tampering with messages; if faults were injected after these measures were applied, the recipient would reject the modified packets and the intended fault class would not be tested.

Our set of software tools, OGSA-FIT (OGSA Fault Injection Technology), is a GUI based implementation of the command line tools developed for the experiments conducted in [6]. The tools have been considerably enhanced to allow the semi-automatic generation of test scripts from WSDL interface definitions as well as a graphical method for defining triggers.

2 OGSA-FIT Design

OGSA-FIT consists of five elements: 1) a fault injector framework; 2) hooks into the SOAP protocol stack to pass SOAP messages through the fault injector; 3) a user script to inject faults into the message stream; 4) a set of software modules to allow the generation of skeleton user tests; and 5) a GUI to allow easy user interaction with the system.

Figure 1 depicts how a typical RPC would be processed by the system: An RPC call is sent to the SOAP API via the OGSA API (1). A SOAP message is forwarded via the socket to the fault injector (2). If a fault is to be injected, the message is processed by the script (3) and a new message is sent back to the SOAP API via the socket (5). If no fault is to be injected, the message is sent back unaltered (4). The SOAP API then transmits the message (6) to the SOAP API of the server machine, where it is decoded and sent to the service via the OGSA API (7). After step (2) the SOAP message may or may not contain a fault, depending on whether the trigger fires. The process could be repeated by the system in the same way with the response message from the server to the client (this is shown with shadow lines on the diagram).
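The round trip through the fault injector (steps 1-5) can be sketched with a plain TCP socket pair. This is a minimal sketch: the message format, the trigger, and the injected fault below are hypothetical stand-ins, not the actual OGSA-FIT protocol.

```python
import socket
import threading

def injector_server(server_sock, trigger):
    """One-shot fault injector: receive a message, optionally mutate it
    (steps 2-3), and send the possibly modified message back (steps 4-5)."""
    conn, _ = server_sock.accept()
    message = conn.recv(65536).decode("utf-8")
    if trigger(message):
        # Hypothetical fault: corrupt a return value in the message.
        message = message.replace("<value>3</value>", "<value>-1</value>")
    conn.sendall(message.encode("utf-8"))
    conn.close()

# Hypothetical trigger: fire on any message carrying a <value> element.
trigger = lambda msg: "<value>" in msg

server = socket.socket()
server.bind(("127.0.0.1", 0))            # ephemeral port on localhost
server.listen(1)
worker = threading.Thread(target=injector_server, args=(server, trigger))
worker.start()

# Hook side (steps 1 and 6): forward the outgoing message, use the reply.
hook = socket.create_connection(server.getsockname())
hook.sendall(b"<env><value>3</value></env>")
reply = hook.recv(65536).decode("utf-8")
worker.join()
hook.close()
server.close()
print(reply)
```

The hook is deliberately thin: all decision making lives in the injector process, which is why only a small amount of hook code is needed per platform.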

Figure 1: Interception of an RPC call

2.1 Fault Injector Framework

Since OGSA messages are encoded in an XML based format, each message is parsed and re-emitted to inject a fault. This takes place before any message signing and encryption has been done. The fault injector uses a SAX parser and a two-stage parse to process each message.

The first pass acts as a trigger to determine if a message is to have a fault injected into it. The second pass only executes if the trigger pass indicates that a fault needs to be injected. This is done as a means of speeding up the overall throughput of the fault injector, since injecting the fault is relatively time consuming.
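A minimal sketch of this two-pass scheme using Python's xml.sax; the tag name `result` and the corruption applied are illustrative assumptions (the real framework delegates these decisions to user scripts), and attributes/namespaces are ignored for brevity.

```python
import xml.sax
from io import StringIO

class TriggerPass(xml.sax.ContentHandler):
    """First pass: a cheap scan that only decides whether to inject."""
    def __init__(self, target_tag):
        self.target_tag, self.fire = target_tag, False
    def startElement(self, name, attrs):
        if name == self.target_tag:
            self.fire = True

class InjectPass(xml.sax.ContentHandler):
    """Second pass: re-emit the message, corrupting the target tag body."""
    def __init__(self, target_tag):
        self.target_tag, self.out, self.in_target = target_tag, [], False
    def startElement(self, name, attrs):
        self.in_target = (name == self.target_tag)
        self.out.append("<%s>" % name)
    def characters(self, content):
        self.out.append("XXX" if self.in_target else content)
    def endElement(self, name):
        self.in_target = False
        self.out.append("</%s>" % name)

def process(message, target_tag="result"):
    trigger = TriggerPass(target_tag)
    xml.sax.parse(StringIO(message), trigger)
    if not trigger.fire:                 # fast path: skip the costly pass
        return message
    inject = InjectPass(target_tag)
    xml.sax.parse(StringIO(message), inject)
    return "".join(inject.out)

print(process("<env><result>42</result></env>"))
```

Messages that do not trip the trigger never pay for the second, slower re-emission pass, which is the throughput optimisation described above.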

The injector is written as a separate process from the software under test, with hook code in that software communicating with the fault injector. Communication is facilitated by a TCP/IP socket connection, with the fault injector process acting as a server.

The injector is structured in this way for three reasons: 1) it simplifies the design of the injector software, which only has to be implemented once, with only a small amount of hook code needing to be implemented for each host platform/language; 2) since the fault injector is a separate process it is possible to offload this processing onto a separate machine; and 3) it should be possible for many nodes to use the same instance of the fault injector, which will allow multi-node experiments to be co-ordinated.

In addition to injecting the faults, the fault injector framework also monitors the stream of messages for generated faults. These fault conditions are recorded in a log file, along with notifications of the injected faults. This log file is stored in XML format so it can be easily analysed.
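A sketch of the kind of XML log record this implies; the element and attribute names here are assumptions, not the actual OGSA-FIT log schema.

```python
from xml.sax.saxutils import escape
import time

def log_entry(kind, detail):
    """Build one log element; kind distinguishes faults we injected
    from fault conditions we merely observed in the message stream."""
    return '<entry type="%s" time="%.6f">%s</entry>' % (
        kind, time.time(), escape(detail))

log = ["<log>"]
log.append(log_entry("injected", "modified return value in <result>"))
log.append(log_entry("observed", "SOAP Fault returned by server"))
log.append("</log>")
print("\n".join(log))
```

Keeping the log well-formed XML (note the escaping of embedded markup) is what makes the later automated analysis straightforward.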

The fault injector is written in Python 2.1 as a series of classes and a global function to execute the user test script. The Python language was chosen to implement the test harness because it is a scripting language that provides the following: 1) object oriented facilities; 2) a comprehensive set of XML processing classes and string processing functions; and 3) on-the-fly compilation, which allows easy modification of scripts whilst providing the speed benefit of p-code execution.

The initial framework developed in [6] was executed standalone by a Python interpreter running under Linux. OGSA-FIT is written in Java so that it is as platform independent as possible. To allow the Python scripting language to be retained within OGSA-FIT, the Jython package [7] was used to allow Python code to be executed from the Java JVM.

2.2 SOAP API Modifications

The SOAP library modifications consist of a small quantity of hook code inserted into the library: one hook for outgoing messages and another for incoming messages. The hook code opens a socket to the fault injector server, sends the SOAP packet to it, and receives the modified packet back. This modified packet then replaces the packet constructed by the SOAP library.

2.3 User Test Scripts

User test scripts are written as three derived classes that the fault injector creates instances of to test for, inject and record faults. The fault injector framework creates a new instance of both the injection based class and the handler based class for each message processed.

A results class is created once and is in scope for the entire lifetime of the test run so it can record such metrics as tag statistics.

A trigger class is derived from the class inject and must define several methods. start, body and end are called by the framework to process a tag start, tag body and tag end. isInject is called to ascertain whether a fault should be injected into a message. isEndOfRun is called to determine whether the fault injector should terminate. By combining these methods with data obtained from the result class, fault triggers can be defined. The base class also contains a method expectFault, which instructs the fault injector framework to scan the next message, check whether it is a fault packet, and log the result of this scan to the log file.
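A sketch of a derived trigger, with the base class stubbed out from the method names above. The paper routes cross-message state through the results class, since a fresh trigger instance is created per message; here a class attribute stands in for brevity, and the `add` tag and every-third-message policy are invented for illustration.

```python
class Inject:                          # stand-in for the framework base class
    def start(self, tag): pass
    def body(self, tag, text): pass
    def end(self, tag): pass
    def isInject(self): return False
    def isEndOfRun(self): return False

class EveryThirdAdd(Inject):
    """Fire on every third message that contains an <add> call."""
    seen = 0                           # survives across per-message instances
    def __init__(self):
        self.has_add = False
    def start(self, tag):
        if tag == "add":
            self.has_add = True
    def isInject(self):
        if self.has_add:
            EveryThirdAdd.seen += 1
        return self.has_add and EveryThirdAdd.seen % 3 == 0
    def isEndOfRun(self):
        return EveryThirdAdd.seen >= 30

# Simulate the framework feeding six messages that each contain <add>.
fired = []
for _ in range(6):
    trigger = EveryThirdAdd()
    trigger.start("add")
    fired.append(trigger.isInject())
print(fired)   # fires on messages 3 and 6
```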

An inject class is derived to inject a fault into a message. The inject class gets its state from the result class, which is in turn set from the trigger class (which will be out of scope by the time the inject class runs). The inject class works in a similar way to the trigger class, with start, body and end methods, but these methods are used to actually inject a fault into the message.

2.4 OGSA-FIT GUI

A GUI has been implemented in Java to allow easy user interaction with OGSA-FIT. Easy user interaction is critical if the tool is to be widely distributed and used by users who are not initially familiar with fault injection techniques. It is hoped that future research will allow OGSA-FIT to be enhanced to a degree where automatic script generation and result analysis is possible.

A new feature of OGSA-FIT is a rudimentary test script creation facility. Scripts are constructed as a visual tree by parsing the WSDL definitions output by Globus 3.0 [1]. The visual tree can then become a basis for generating tests manually.

Once all the tests are defined for a particular middleware component a test script can be generated from the tree and saved as a Python file.

OGSA-FIT incorporates the Jython package to allow its test scripts to be executed. It was felt the Python scripts should be retained from the initial experiments rather than reimplemented in Java because: 1) a scripting language was required, rather than a programming language like Java; and 2) there is a relatively easy way to unload Jython classes from the system, whereas native Java makes this very difficult.

OGSA-FIT is based around a dynamically modifiable menu system, which is closely linked to fault injector script generation. OGSA-FIT is designed to scan a number of standard directories and then construct certain menus within the system from the contents of these directories. This allows the test scripts to be added dynamically to the menus as they are created. It is envisaged that users could also use this facility to extend and customise OGSA-FIT for their own needs and add facilities such as custom analysis scripts, etc.
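The directory-scan-to-menu idea can be sketched as follows; the file naming convention and the flat menu structure are assumptions for illustration, not OGSA-FIT's actual layout.

```python
import os
import tempfile

def build_menu(script_dir):
    """Map menu labels to the .py test scripts found in a directory,
    so dropping a new script into the directory adds a menu entry."""
    entries = {}
    for name in sorted(os.listdir(script_dir)):
        if name.endswith(".py"):
            label = name[:-3].replace("_", " ").title()
            entries[label] = os.path.join(script_dir, name)
    return entries

# Demonstrate with a throwaway directory holding two hypothetical scripts.
d = tempfile.mkdtemp()
for script in ("null_message.py", "corrupt_return.py"):
    open(os.path.join(d, script), "w").close()
print(sorted(build_menu(d)))
```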

3 Comparison of Proof of Concept Environment with Globus 3.0

The purpose of our initial proof of concept experiment was threefold: 1) to analyse differences between techniques used to characterise tightly coupled, RPC based distributed systems and GRID based distributed systems, 2) to define a method to characterise GRID middleware and 3) to apply this method to a key component of OGSA.

Initial experiments were conducted using a web-service based system because, at the time of the experiments, Globus 3.0 had not been released, even as an alpha. We therefore constructed a simulated OGSA environment composed of Apache Tomcat version 4, Apache SOAP 2.2 and Java 1.3.1 running on a RedHat Linux 8.0 installation. This system allowed us to emulate a simple SOAP based transaction but without any of the intricacies of OGSA statefulness or security. It also gave us the opportunity to perfect our fault injection framework in a simple environment before applying it to the much more complex environment of an OGSA system.

We intend to maintain and enhance the compatibility with standard web-services so that OGSA-FIT can be used to assess web-service middleware dependability as well as OGSA middleware dependability.

Since our test bed system was not implemented using Globus we must analyse differences between the two systems to: 1) determine if the method can be successfully applied to Globus, and 2) determine which areas of the fault injection framework will require modification.

Our experiments are based on Globus 3.0 alpha 3 on two PCs running RedHat 9.0 Linux.

3.1 Stateless Environment

As stated above our initial experiments were conducted using a standard web-server. Standard web-servers are inherently stateless in nature. Any data that is held is usually shared between all clients with no distinction between clients.

Some web-services maintain state by transmitting a context id along with the SOAP packet. This method is also used by some classical distributed systems, e.g. CORBA and DCOM [8, 9]. This id defines which context should be loaded and used when processing a request from a client, thus allowing a separate state to be maintained for each client (or group of clients). States may be maintained in physical memory, i.e. RAM, or in some sort of backing store, e.g. disk storage or a database.

OGSA implements a similar context id system. Each service instance is defined by a Global Service Handle (GSH), which is a unique identifier used to identify an endpoint to an instance of a service. The GSH is used in the same way as a context id within a SOAP packet. By tracking the GSH of each SOAP packet against a database of GSHs, a specific instance of a GRID service can be tracked and targeted.
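A sketch of this handle-based targeting; the `<handle>` element and the instance table below are illustrative assumptions (where the handle actually appears inside a Globus message is not modelled here).

```python
import re

# Known service instances keyed by handle; entries are hypothetical.
known_instances = {
    "http://host/counter/instance-1": "target",
    "http://host/counter/instance-2": "ignore",
}

def handle_of(soap_message):
    """Pull the handle out of a message; the <handle> element is an
    illustrative assumption, not the real OGSA encoding."""
    m = re.search(r"<handle>(.*?)</handle>", soap_message)
    return m.group(1) if m else None

def should_target(soap_message):
    """Only inject into packets belonging to the targeted instance."""
    return known_instances.get(handle_of(soap_message)) == "target"

msg1 = "<env><handle>http://host/counter/instance-1</handle><add>3</add></env>"
msg2 = "<env><handle>http://host/counter/instance-2</handle><add>3</add></env>"
print(should_target(msg1), should_target(msg2))
```

This is the essence of a stateful 'personality module': the lookup table gives the injector a notion of service instances that a stateless web server never needed.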

Since the software written in [6] was written as a number of derived Python classes, it is our intention to write a set of derived classes, fitting between the user script and the existing fault injection framework classes, that would provide this functionality. It is also our intention to write a set of classes to facilitate more detailed testing of traditional web service SOAP RPC calls.

By providing these classes, based on the extensible nature of our framework, we have the facility to change the 'personality' of the software and thus test different protocols. In the future more 'personality modules' could be written to allow more detailed tracking of SOAP RPC calls.

3.2 SOAP Implementation

Our initial experiments were based on a system running Apache SOAP 2.2. This was chosen because of its stability and the authors' familiarity with this implementation.

Apache SOAP is a tried and tested SOAP implementation but it does suffer from a number of deficiencies. The first is performance when compared to other XML based transfer techniques [10]. A secondary concern is the extensibility of Apache SOAP, due to its rigid structure and close integration with the HTTP transport protocol [11].

Apache SOAP is based around a DOM tree representation of a SOAP packet. Data is added to the DOM tree as required and then the DOM tree representation is parsed and output as a string to the transport. Since this conversion to a string is done in one specific place in the code it is relatively easy to instrument the protocol stack.

Similarly there is a single point to instrument the protocol stack for an incoming packet just before the packet is parsed into the DOM tree.

Apache Axis [12] has been designed to supersede Apache SOAP and overcome the two main deficiencies of the old implementation [11]. Firstly, it has been designed to utilize a SAX parser to construct packets from incoming data streams. Secondly, it has been implemented to allow easy integration with different transport protocols, not just HTTP. As a by-product of this, Axis has been designed to allow plug-in modules to be inserted into its protocol stack to alter messages as the SAX parser parses them.

Although the new Axis implementation improves performance and allows greater flexibility when integrating with different transport protocols, it does pose some challenges for instrumentation.

Since a SAX parser is used to parse both incoming and outgoing SOAP messages, there isn't a single point in the code at which to instrument the protocol stack: an outgoing SOAP message is only constructed into a string to send to the transport once all the signing and encryption modules have been run on it. Conversely, a received SOAP message is still signed and encrypted while it is contained in a string; by the point at which it is decrypted it has already been parsed and is contained in a number of different data structures within the SOAP stack.

An interim solution to the problem is to instrument the actual code that does the sending and receiving of SOAP messages. These points can be found in the classes HTTPSender (for transmitting a message) and SOAPPart (for receiving a message). This method has the advantage that it is easy to implement and the complete message is constituted into a string, which can be passed to OGSA-FIT in the same way as in the original experiment [6]. This method is a good first step in porting OGSA-FIT to the Globus 3.0 environment since it allows the framework to be quickly tested, but since all messages are intercepted at a point after they have been signed and encrypted it will only work if these facilities are not used. It also has the disadvantage of tying OGSA-FIT to a specific transport protocol, i.e. HTTP.

Our permanent solution to this problem is to write a module to plug into the protocol stack. This module will use the same plug-in technology as the WS-Security and WS-Encryption modules. It must be positioned in the calling chain before the WS-Security and WS-Encryption modules are executed (or after them for received messages). This new module must gather the data stored in the various data structures held within the protocol stack, pass this information to OGSA-FIT, and receive the possibly modified data back from OGSA-FIT so that it can be rewritten into the protocol stack before normal processing of the message continues.

4 Latency Model

In [6] we noted that the fault injector would introduce a latency into the transmission of SOAP messages. Although this latency is assumed to be small, it could become significant in long duration experiments because its effects could be cumulative, and it could distort the results of time sensitive applications. Here we present a model of this latency and some simple steps for minimizing its effects.

Figure 2 describes the various terms that make up the total latency within the system.

Figure 2: Fault Injection Latencies (terms: T_tf, T_tnf, T_tn, T_tt, T_ea, T_et, T_es, T_er, T_el)

From these we can make some assumptions about timing. Firstly we shall assume that the terms T_et and T_es will execute in approximately the same time, since they both run a SAX parser over the same XML source. Secondly we will assume that all network transfers will take approximately the same time to execute. These terms are shown in Equation 1.

Equation 1: Approximations

T_et ≈ T_es = T_p
T_tt ≈ T_tnf ≈ T_tn ≈ T_tf = T_t

From the system described in Figure 2 there are three different latencies that can be derived: 1) Inject No Fault (Equation 2), 2) Inject A Null Message (Equation 3), and 3) Inject A Fault (Equation 4).

Equation 2: Inject No Fault

T_inf = T_ea + T_tt + T_et + T_tnf + T_el = T_ea + 2T_t + T_p + T_el

Equation 3: Inject Null Message

T_in = T_ea + T_tt + T_et + T_tn + T_el = T_ea + 2T_t + T_p + T_el

Equation 4: Inject A Fault

T_if = T_ea + T_tt + T_et + T_es + T_er + T_el + T_tf = T_ea + 2T_t + 2T_p + T_er + T_el

Since T_inf and T_in are the same we can extract this as a common term in Equation 5, and we can use this to reduce Equation 4 to Equation 6.

Equation 5: Common Term

T_com = T_ea + 2T_t + T_p + T_el

Equation 6: Reduced Inject Fault

T_if = T_com + T_p + T_er

From this we can see that the difference between injecting a fault and the other two cases is just the time taken to process the user script and the time spent interacting with the result class.

4.1 Comparison of T_in and T_if Timing

This experiment was designed to verify the relationship derived in Equation 6. This was done by injecting a timed sequence of fault messages and then a timed sequence of null messages into the test system and comparing the average time taken to process each.

The SOAP API was instrumented with a timer started just before the hook code and stopped just after the hook code. A normal duration timer was used to provide the measurement, which therefore included any system overhead due to multitasking, etc. This was felt to be justifiable since the model includes several network transfers to different machines, so exact process time measurement isn't possible. Using a large sample size for the test will average out any irregularities introduced into the measurements.
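The duration-timer measurement described above amounts to something like the following sketch, where the workload is a placeholder for the instrumented hook round trip.

```python
import time
import statistics

def timed(call, repetitions=1000):
    """Wall-clock each call with a duration timer and summarise over a
    large sample, so multitasking jitter averages out."""
    samples = []
    for _ in range(repetitions):
        t0 = time.perf_counter()
        call()
        samples.append(time.perf_counter() - t0)
    return statistics.mean(samples), statistics.stdev(samples)

# Placeholder workload standing in for the hook round trip.
mean, sd = timed(lambda: sum(range(100)))
print(mean > 0.0, sd >= 0.0)
```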

The code tested was a modified version of the CounterService and CounterClient supplied as part of Globus 3.0. The CounterClient was modified so that it contained a loop around the call to the add service method, to allow a large number of repetitions for the test.

The test was performed for T_if by injecting simple fault messages (modified return values). Whilst this was performed the system appeared to function normally. The results are given in Figure 3. This gave an average latency of 185 µsec. The high standard deviation in this experiment is due to network delays (see Figure 4).

                All          Write        Read
                (micro sec)  (micro sec)  (micro sec)
Mean            185          78           291
Std Dev         221          33           272
Max             695          235          695
Min             16           60           16
Median          65           65           294
1st Quartile    58           62           18

Figure 3: Total Latency of no fault packets

An attempt was then made to perform the same experiment for T_in by running the same test code but discarding messages. At this point a bug was discovered in the Globus 3.0 code, which prevented the experiment from running to completion.

A brief analysis of the code suggests that the problem lies in assumptions made by the JAX-RPC mechanism, and a small flaw in the Axis SOAP code. When the JAX-RPC mechanism performs a request/response sequence it assumes that a response will always be sent by the reciprocal JAX-RPC library on the server. If this is not the case, the Axis SOAP code has no timeout mechanism in place to cope with a discarded message. The class HTTPSender will block indefinitely in the readHeadersFromSocket method, waiting for the server to send a response. This situation could happen if a client sent a request and the server crashed before a response was transmitted.
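The failure mode can be reproduced in miniature, along with the obvious remedy of bounding the blocking read with a timeout. This is a sketch of the problem class, not the Axis code itself.

```python
import socket
import threading

# A server that, like a crashed service, accepts the request but never replies.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
held = []

def crashed_server():
    conn, _ = server.accept()
    held.append(conn)          # keep the connection open but silent

threading.Thread(target=crashed_server, daemon=True).start()

client = socket.create_connection(server.getsockname())
client.sendall(b"request")
client.settimeout(0.1)         # the missing safeguard: bound the blocking read
try:
    client.recv(1024)          # without the timeout this would block forever
    outcome = "reply"
except socket.timeout:
    outcome = "timed out"
print(outcome)
client.close()
server.close()
```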

Since verification of Equation 5 and Equation 6 isn't currently possible, we shall assume that the approximation made in Equation 5 is correct, to allow us to verify T_if.

4.2 Detailed Verification of T_if

This experiment was designed to provide more detailed data for the timing model for T_if. For this experiment we will determine timings for T_tt and T_tf to ascertain that they are approximately the same (T_t). The experiment will also determine the latency introduced by T_es and T_et, to determine if they are approximately the same (T_p). Finally an overall timing will be taken, including network latencies and SOAP hook code latencies. Since all network latencies and SOAP hook code latencies are the same as for the experiment conducted in section 4.1, they can be eliminated from the equations to give the required terms.

This experiment will use duration timers to determine latencies, with variations introduced by system load being averaged over the sample set. We are using these timers in place of process time timers because we are interested in the overall throughput of the system and any real world delay introduced.

The overall delay will be measured using the same measurement points as used in section 4.1; this will be T_if. The term T_et will be measured by timing the call to the SAX parser and the subsequent code that bypasses the second stage parser for Null and No Fault messages. The term T_es will be measured by timing the second stage parser used to inject faults. The term T_el will be measured using a cumulative method, reset every iteration round the processing loop, since the logging function is called from various points in the code.
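The cumulative measurement of T_el can be sketched as a small timer that accumulates across the scattered logging calls and is read and reset once per processing-loop iteration (the timed workload here is a placeholder for the log writes).

```python
import time

class CumulativeTimer:
    """Accumulate elapsed time across scattered calls, as for T_el where
    logging happens at several points per message."""
    def __init__(self):
        self.total = 0.0
    def time_call(self, fn, *args):
        t0 = time.perf_counter()
        result = fn(*args)
        self.total += time.perf_counter() - t0
        return result
    def reset(self):
        """Read and zero the accumulator at the end of a loop iteration."""
        elapsed, self.total = self.total, 0.0
        return elapsed

log_timer = CumulativeTimer()
log_timer.time_call(sum, range(10))      # stand-ins for two log writes
log_timer.time_call(sum, range(10))
iteration_total = log_timer.reset()
print(iteration_total >= 0.0, log_timer.total == 0.0)
```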

The results from this experiment are given in Figure 4 and Figure 5. By far the largest term is T_t, with a mean value of 61 µsec. This is to be expected since the network transfer is relatively slow and has a high timing variability. This correlates well with the figure for T_if given in Figure 3.

Figure 4: Detailed Timing of T_if (columns: T_tt; T_et + T_el; T_es; and T_tt + T_et + T_es + T_er + T_tf + 2T_el; all in micro sec)

Both T_et and T_es are relatively small in comparison (approximately 10% of T_tt). In this experiment the fault injected into the SOAP packet was performed using a single if statement, and so the time taken to execute T_es was correspondingly small. In more complex fault injection scripts the value of T_es would increase in proportion to the number of if statements and the amount of string handling done.

A further set of timings was undertaken for T_el, which was cumulatively measured in this experiment, so the combined value 2T_el was calculated. T_el was found to have a mean value of 30 µsec. A proportionally large time was expected for this value since it performs a write to disk. The large standard deviation can be assumed to be the result of buffering in the disk routines of the operating system (since about half the log calls in the raw data gave a timing of 0).

                2T_el
                (micro sec)
Mean            61
Std Dev         103
Min             0
Max             330
Median          5
1st Quartile    0
3rd Quartile    122

Figure 5: Log Timing

4.3 Refinement of the Equations

From these experiments we can conclude that only network transfers and log operations are significant in calculating the latency of an injected fault using this method. A simplified equation is given in Equation 7.

Equation 7: Simplified Model

T_com = 2T_t + T_el
T_in = T_inf = T_com
T_if = T_com + T_el

Given that this is the case, an average fault on the system used to perform these tests would take 182 µsec, which compares favourably with the overall timing given in Figure 3.

Given this figure we can estimate how long a particular system would have to execute before the cumulative effect of using the fault injector would become noticeable. If we assume that for a given service operation 4 SOAP packets are intercepted, with a fault injected into 2 of them and the rest passed through unaltered, then there would be a cumulative delay of approximately 2T_inf + 2T_if per operation, which works out to 0.68 msec.
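Plugging the measured means (T_t = 61 µsec for a network transfer, T_el = 30 µsec for a log write) into the simplified model reproduces these figures; this block is simply the paper's arithmetic restated.

```python
# Measured mean latencies, in microseconds (section 4.2).
T_t = 61       # one network transfer
T_el = 30      # one log write (half of the measured 2*T_el = 61)

T_com = 2 * T_t + T_el        # Equation 7: no-fault and null-message cases
T_if = T_com + T_el           # fault case incurs one extra log operation

# Worked example: 4 packets intercepted per operation, faults in 2 of them.
per_operation = 2 * T_com + 2 * T_if
print(T_if, per_operation)
```

This yields 182 µsec per injected fault and 668 µsec per operation, consistent with the approximate per-operation figure quoted.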

This delay could become significant, so timing constraints in a system being tested using OGSA-FIT may have to be increased if services were running continuously.

5 Latency Reduction

As shown in Section 4, the latencies that affect the operation of a system being assessed using OGSA-FIT are network transfers and logging. In line with this, we have modified our system architecture to minimize these effects.

In our original proof of concept system we instrumented both the SOAP API of the client and also the SOAP API of the server. This was done to give us the ability to manipulate packets at both ends of the communication system.

This had the disadvantages that: 1) the service container had to be modified as well as the client software, which effectively means modifying the software that you are testing; and 2) the time taken to perform the fault injection for a request and response to a service was doubled.

In this modified architecture we have removed the need to instrument both client and service container (although this can be done if the need arises). Our initial experiments indicate that this approach can be as effective as our original method.

Another modification we have undertaken to the architecture is to allow the fault injector to have a special control pathway that allows the generation of null packets without the need to run the second stage of the parsing process. Since our initial experiment to determine the timing for this uncovered a bug in the middleware it remains to be proven if this enhancement will significantly improve overall performance of the system.

6 Conclusions

The main intent of this paper was to present our research into a timing model for our OGSA-FIT tools when used in conjunction with Globus 3.0 middleware. The timing model presented in this paper has shown that a significant cumulative timing latency could be introduced into a system under test. Steps have been taken to reduce this with a modified system architecture.

The initial experiments performed on Globus uncovered a problem with the middleware. This could become a problem if a server crashed before returning a response to a request packet.

This paper also details our research into integrating our OGSA-FIT tools with the Globus 3.0 toolkit. This has involved inserting SOAP hooks into the Axis SOAP library used by the Globus system. We have implemented an intermediate solution for implementing hooks, which has allowed us to easily port our existing framework to a Globus environment. We propose a system of inserting hooks into the SOAP library that will allow security plug-ins to be enabled.

Lastly we propose an extensible system of personality modules that will allow OGSA-FIT to track stateful OGSA objects and thus allow fault injection to be better targeted at a specific instance of an OGSA object.

Finally this paper discusses our OGSA-FIT GUI. This GUI enhances our existing framework and allows easy user interaction with a fault injection campaign. The GUI allows semi-automatic fault injection campaigns by parsing the WSDL definitions generated by a Globus build. The GUI also integrates the Python based fault injection framework into the Java GUI by use of the Jython package.

7 Future Research

Further work is required to enhance this method and provide easy to use tools to allow fault injection testing.

Work will be undertaken to enhance the existing fault injector to accommodate robustness testing techniques. This will be done by using the existing framework to modify API parameters encapsulated in a SOAP message, thus achieving the aim of corrupting data at the API interface. This may be integrated with the automatic script generation facility to allow users to easily test new general components as well as applying this to specific middleware components.

Another area of research will be to enhance the existing framework to track the stateful nature of GT3 services and allow this information to be used to target fault injection at specific services and instances of services, using the method given in this paper.

The method will also have to be enhanced to test for signing and encryption faults, although it is envisaged that this will be performed by some other mechanism than the fault injector framework, since this method carefully avoids this problem.

8 Acknowledgements

This work was supported by the EPSRC IBHIS project and the EPSRC/DTI e-Demand project.

9 References

[1] I. Foster, C. Kesselman, J. M. Nick, and S. Tuecke, "The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration," Argonne National Laboratory, 2002.

[2] G. Allen, T. Dramlitsch, I. Foster, N. Karonis, M. Ripeanu, E. Seidel, and B. Toonen, "Supporting Efficient Execution in Heterogeneous Distributed Computing Environments," presented at Proceedings of SC, 2001.

[3] K. Keahey, T. Fredian, Q. Peng, D. P. Schissel, M. Thompson, I. Foster, M. Greenwald, and D. McCune, "Computational Grids in action: the National Fusion Collaboratory," Future Generations Computer Systems, vol. 18, pp. 1005-1015, 2002.

[4] M. Russell, G. Allen, G. Daues, I. Foster, E. Seidel, J. Novotny, J. Shalf, and G. von Laszewski, "The Astrophysics Simulation Collaboratory: A Science Portal Enabling Community Software Development," Cluster Computing, vol. 5, pp. 297-304, 2002.

[5] M. R. Lyu, Software Fault Tolerance. Chichester; New York: John Wiley, 1995.

[6] N. Looker and J. Xu, "Assessing the Dependability of OGSA Middleware by Fault Injection," University of Durham, Durham, Technical Report 01/03, 27/04/03 2003.

[7] "Overview of Jython Documentation," 2003, http://www.jython.org/docs/index.html

[8] "CORBA Component Model," V3.0 ed: Object Management Group, 2002, http://www.omg.org/cgibin/doc?formal/02-06-65

[9] "Distributed Component Object Model (DCOM) Binary Protocol," Microsoft Corporation, 1997, http://www.mircosoft.com/com/tech/dcom.asp

[10] P. Hrastnik, "Comparison of Distributed System Technologies for E-Business," presented at 2nd International Interdisciplinary Conference on Electronic Commerce (ECOM-02), Gdansk, Poland, 2002.

[11] L. Denanot, "Introducing Axis," 2002, http://www.techmetrix.com/trendmarkers/publi.php?C=22OFO

[12] "Axis Architecture Guide," 1.1 ed: Apache Axis Group, 2003, http://ws.apache.org/axis/
