Using Grid Services From Behind A Firewall



A.L. Rowland*, M. Burns, J.V. Hajnal*, D. Rueckert, D.L.G. Hill

* Imaging Sciences Dept, Imperial College London

‡ Dept. of Computing, Imperial College London

† Centre for Medical Image Computing, University College London

Abstract

Grid services promise the power of the supercomputer to the average desktop user. However, as with all forms of distributed computing, they rely on the sending and receiving of messages between components for the system to work. The rapid rise in attacks by malicious users has meant that firewalls have become a way of life for any machine permanently connected to the internet. Opening external firewall ports involves a degree of trust which, even to ‘friendly’ networks, is still unacceptable for many institutions.

This paper discusses the security concerns and describes a system for submitting complex workflows to a remote grid service for execution and retrieving the corresponding output, from a location situated behind a firewall with no open inbound ports without the need for any low level user interaction.

Introduction

Data protection has become a serious issue in the 21st century. All manner of personal details are now stored on computer systems throughout the world, and the onus of managing access falls on the institution that stores the data. Nowhere is this more critical than in the field of medical imaging, where patient confidentiality is paramount. Typically, medical institutions will deny access to all of their resources from any location outside their domain.

At the same time, recent advances in medical imaging have resulted in a deluge of digital data that requires vast amounts of computing power to process. Grid technology appears an attractive solution to this problem, with features such as encrypted data transfer and certificate-based authentication being integral to the technology. However, as with all distributed systems, a degree of unhindered bidirectional communication is still required [3].

Network Security

From a general point of view, network security threats fit into one of the following categories [5]:

Attempts to read or modify confidential data

Denial of service – preventing genuine users from accessing services

Using local network resources to launch attacks elsewhere

Whilst these are all important, the first of these is often perceived as being of most concern.

Denial of service attacks differ from the others in that they are aimed at bringing down a service rather than actually gaining access, and so fall into their own separate category.

So does opening a port on a firewall constitute a threat to network security?

Network administrators will argue that a network is only as secure as its weakest link. Logically therefore, a firewall with open inbound ports is more at risk than one with all ports blocked.

However, in reality the risk depends on the quality of the server software listening on that port. To actually gain access to a resource a malicious user requires:

A server process listening on a specific port

That port to be accessible from a remote location

That server process to have a flaw which will allow it to be compromised

The majority of security exploits in recent years have been of the “buffer overflow” type [8], where a malicious user exploits a design flaw in the server program to gain control.

Buffer overflows usually occur in programs written in C. In the C language, arrays are not size-aware, and overfilling an array will cause the excess data to overwrite whatever comes after it in memory. If this memory happens to be on the call stack, then it is possible for the user to supply a small program followed by a fake return address. When the flawed function eventually returns, it will read the fake return address, jump straight to the malicious code and run it.
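The mechanism can be illustrated with a small simulation. The sketch below is written in Python rather than C and models a stack frame as a flat byte array: an 8-byte buffer followed directly by a 4-byte return address. The layout, the sizes and all function names are assumptions made for illustration; real stack frames vary by compiler and platform.

```python
import struct

BUF_SIZE = 8            # hypothetical size of the local array
RET_OFFSET = BUF_SIZE   # in this toy layout the return address follows the buffer


def make_frame(return_address: int) -> bytearray:
    """Build a toy frame: BUF_SIZE zero bytes, then a 4-byte little-endian address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", return_address))


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Mimic an unchecked C copy: write every input byte, ignoring BUF_SIZE."""
    for i, byte in enumerate(data):
        frame[i] = byte


def checked_copy(frame: bytearray, data: bytes) -> None:
    """Bounds-checked copy: refuse input that does not fit in the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input longer than buffer")
    frame[:len(data)] = data


def read_return_address(frame: bytearray) -> int:
    """Return the address the simulated function would jump to on return."""
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]


# Eight filler bytes fill the buffer; the next four replace the return address.
payload = b"A" * BUF_SIZE + struct.pack("<I", 0xDEADBEEF)
frame = make_frame(0x08048000)
unchecked_copy(frame, payload)
print(hex(read_return_address(frame)))
```

With the unchecked copy the stored return address is replaced by the attacker-supplied value, whereas `checked_copy` rejects the same payload and leaves the frame untouched.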

A significant wake-up call for the grid community occurred in 2003 [6], involving Grid FTP, part of the Globus toolkit. Grid FTP, a well-used and trusted service, was discovered to be vulnerable to a buffer overflow attack. Grid FTP had been built on top of the popular WU-FTPD server and unfortunately inherited many of its flaws.

The majority of e-science projects permit Grid FTP through their firewalls due to its apparently secure X.509 authentication mechanism. At the time, though, this trust would have been misplaced: the Grid FTP daemon runs as ‘root’ on Unix systems, so a successful exploit would have given an external attacker system-wide access to just about everything on the network.

Needless to say, this bug has since been fixed.

So is the idea of blocking all inbound ports safe?

Most private networks block inbound ports but permit unhindered outbound connections to the internet.

Traffic returning via the internally initiated connection is then permitted through the firewall without question.

However, a client application can be exploited in exactly the same way as that described previously for server processes. A poorly designed client application running on a user’s desktop machine could just as easily contain a buffer overflow bug that allowed the client machine to be compromised. Indeed, Microsoft’s flagship application Outlook was discovered to be susceptible to just such a flaw [7] if the date header of an e-mail was set to an invalid and oversized value. The official Microsoft security bulletin states:

If the affected field were filled with carefully crafted data, the e-mail client could be made to run code of the malicious user's choice.

Quite a serious concern considering that the e-mail did not even have to be opened for this to occur – only for the headers to be available on the mail server.

This suggests that the policy of no open inbound ports is actually a false hope. Whilst it does restrict the number of potential attacks, the risk is never eliminated entirely. It does however make security easier to administer – network administrators would be extremely busy if both inbound and outbound ports were routinely blocked and many services would simply cease to function.

Firewalls and Distributed Systems

It is a fairly typical scenario for those attempting to deploy grid services to be in a different group to those responsible for network security. Whilst in most cases the two can agree to a compromise and free up the required ports for communication [3,9], the end result will often differ from one organisation to another and will most likely depend on the sensitivity of the data stored inside the firewall.

From the authors’ point of view, being located in a large medical institution means that we have hard-and-fast rules on firewalls which are non-negotiable: the policy states that no inbound ports may be opened.

This inflexible situation forced us to devise ways of getting our services to work reliably by other means.

Firewall traversal techniques have always been a little contrived and more often than not the solution is specific to the particular domain where the problem occurs. Our solution probably also fits into this category although the principles used could be applied in a variety of other situations.

More radical proposals have been suggested to combat firewall issues in distributed computing. One study which stands out in the literature is the CODO [4] system, which proposes the use of a dynamic firewall where access rules are initially minimal but can be added on the fly. In this system a firewall rule is added if both client and server can be authenticated using X.509 credentials. Any such on-demand rule is short-lived and will expire once traffic on that port has been inactive for a relatively short period of time. Such a technique sounds plausible; however, it would be difficult to sell the idea to network administrators, since ultimate control is taken out of their hands.
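The on-demand rule behaviour described above can be sketched as a small rule table with an inactivity timeout. This is not CODO's implementation: the class name, its methods and the explicit `now` timestamps are assumptions chosen to keep the example self-contained, and the two boolean flags stand in for real X.509 authentication.

```python
class DynamicRuleTable:
    """Toy model of on-demand firewall rules that expire after inactivity."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout   # seconds of silence before a rule expires
        self._last_seen = {}               # port -> time of last opening or traffic

    def open_port(self, port: int, client_authenticated: bool,
                  server_authenticated: bool, now: float) -> bool:
        """Add a rule only if both ends have been authenticated."""
        if not (client_authenticated and server_authenticated):
            return False
        self._last_seen[port] = now
        return True

    def is_open(self, port: int, now: float) -> bool:
        """Report whether a rule is still live, dropping it if it has gone idle."""
        last = self._last_seen.get(port)
        if last is None:
            return False
        if now - last > self.idle_timeout:
            del self._last_seen[port]
            return False
        return True

    def record_traffic(self, port: int, now: float) -> None:
        """Refresh the inactivity timer when traffic is seen on a live rule."""
        if self.is_open(port, now):
            self._last_seen[port] = now


table = DynamicRuleTable(idle_timeout=30.0)
table.open_port(5000, client_authenticated=True, server_authenticated=True, now=0.0)
table.record_traffic(5000, now=20.0)
print(table.is_open(5000, now=45.0), table.is_open(5000, now=51.0))
```

In this sketch the rule survives as long as traffic keeps arriving within the timeout and disappears on its own afterwards, which is the property that removes the administrator from the loop.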

Challenges

In a previous publication [1] the group developed a Workflow Service, running as an OGSI grid service, which interpreted high-level workflows and coordinated their execution at multiple locations on the grid. Whilst it is possible to submit a workflow to this service from behind a firewall, the secure location of the input files referenced in the workflow means that the input data is inaccessible and jobs fail to run.

The challenge was therefore to find a way to utilise the existing Workflow Service for submitting jobs from within a department with an impenetrable firewall (no open inbound ports) without compromising the network security of either location.

Problems include:

Data anonymity

Medical image files are often tagged with patient details meaning they cannot be stored in a publicly accessible location

Data access

Strict network security - firewalls block all inbound ports.

Externally sourced file transfer requests will therefore fail.

Data retrieval

Individual tasks are delegated to grid nodes at multiple locations.

Tracking down and retrieving output files is difficult and time consuming.

An existing third-party data retrieval service performs this task but cannot operate through firewalls.

Whilst it is possible to submit simple Globus jobs and transfer files from behind a firewall, this requires detailed knowledge of the Globus toolkit and the manual transfer of files to and from other locations on the grid. Our aim in this project was to make the use of grid services transparent to our end users - a group of medical researchers who are predominantly nontechnical users of compute resources. Thus this level of involvement was not considered appropriate. Instead our users work at the more abstract level of the workflow leaving the low level grid protocols and data transfer to be handled by our own middleware.

Approach

The approach taken involved one hardware and one software modification to our infrastructure:

A demilitarized zone (DMZ) for data storage of anonymised image data

An additional tier in the submission process - a locally hosted Job Monitoring Service

In the previous incarnation of the IXI system [2] the user submitted a workflow directly to the Workflow Service, which then coordinated and delegated tasks to other locations across the grid. These other locations would then request input files as specified in the workflow.

However, if all input files were behind a firewall, jobs would time out waiting for file transfer and the workflow would fail.

Under the new system, the user now submits their workflow to a Job Monitoring Service which resides on their local network. This service does several things:

Anonymises the images by removing patient identifiers from the file headers

Moves the anonymised image data to a DMZ where it can be accessed directly from trusted grid nodes.

Invokes the remote Workflow Service as before, passing the workflow and delegated user credentials
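The three steps above can be sketched as follows. This is a minimal illustration, not the IXI middleware: an image is modelled as a (name, header dictionary, pixel data) tuple, the header field names are hypothetical, and the DMZ transfer and the Workflow Service call are passed in as functions because their real interfaces are not described here.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical header fields treated as patient identifiers.
PATIENT_FIELDS = {"PatientName", "PatientID", "PatientBirthDate"}

Image = Tuple[str, dict, bytes]   # (file name, header, pixel data)


def anonymise_header(header: dict) -> dict:
    """Return a copy of the header with patient identifiers removed."""
    return {key: value for key, value in header.items()
            if key not in PATIENT_FIELDS}


def submit_workflow(workflow: str,
                    images: Iterable[Image],
                    stage_to_dmz: Callable[[str, dict, bytes], str],
                    invoke_workflow_service: Callable[[str, List[str], str], str],
                    credentials: str) -> str:
    """Anonymise each image, stage it in the DMZ, then invoke the remote service.

    stage_to_dmz returns the location at which grid nodes can fetch the file;
    invoke_workflow_service returns an identifier for the submitted workflow.
    """
    staged_locations = []
    for name, header, pixels in images:
        clean_header = anonymise_header(header)
        staged_locations.append(stage_to_dmz(name, clean_header, pixels))
    return invoke_workflow_service(workflow, staged_locations, credentials)
```

Keeping anonymisation ahead of staging in a single function means that no file can reach the DMZ with its patient identifiers still attached.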

Fig. 1 – System Overview

The Job Monitoring Service then periodically queries the Workflow Service for the status of the workflow, which is stored as service data. Normally this would be handled by OGSI notifications using a ‘push’ method from the Workflow Service; however, the firewall blocks all such notifications, so a periodic poll is required to check the service data for each task.

With typical tasks taking many minutes or even hours to run, the polling interval can be relatively long, which translates to negligible overhead for the Job Monitoring Service.

As each task’s status changes to ‘DONE’, the results (output files, standard output, standard error and provenance) are pulled back via Grid FTP to the user’s own network using delegated grid credentials.

This saves considerable time and also spares users from having to locate each file on the grid and transfer it back themselves.
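The poll-and-retrieve cycle can be sketched as a simple loop. The function and parameter names below are assumptions, as is the five-minute default interval; the status query and the Grid FTP retrieval are supplied as callables since their real interfaces are not given here. Only the ‘DONE’ status mentioned in the text is handled, so the optional `max_polls` limit is included as a guard against tasks that never finish.

```python
import time
from typing import Callable, Dict, Optional, Set


def poll_until_done(get_task_statuses: Callable[[], Dict[str, str]],
                    fetch_results: Callable[[str], None],
                    interval_seconds: float = 300.0,
                    max_polls: Optional[int] = None,
                    sleep: Callable[[float], None] = time.sleep) -> Set[str]:
    """Poll task statuses and fetch each task's results once it reports DONE.

    All connections are initiated from inside the firewall, so no inbound
    port is needed. Returns the set of task identifiers that were retrieved.
    """
    retrieved: Set[str] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        statuses = get_task_statuses()
        for task_id, status in statuses.items():
            if status == "DONE" and task_id not in retrieved:
                fetch_results(task_id)      # e.g. pull output files back
                retrieved.add(task_id)
        if statuses and len(retrieved) == len(statuses):
            return retrieved
        sleep(interval_seconds)
    raise TimeoutError("workflow did not complete within the polling limit")
```

Because results are fetched as soon as an individual task completes, retrieval overlaps with the execution of the remaining tasks rather than waiting for the whole workflow.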

Finally the Job Monitoring Service structures the workflow output files into a more human-readable directory layout for easier browsing and notifies the user via e-mail that their workflow has completed and where the results can be found.

From the user’s point of view the files always appear to be local. There is no need for the user to know anything about ‘the grid’, Globus or indeed where their files were sent for processing.

Conclusion

Firewalls have become a necessary evil in today’s world, yet they are a major inconvenience to all forms of distributed computing. This paper discusses some of the general security issues facing anyone involved in distributed computing and describes a method of accessing OGSI grid services and reliable file transfer from behind an impenetrable firewall. Compromises in design, most notably polling for status changes rather than receiving notifications, are offset by reliability and usability. The result is a high-level, workable system which provides full access to the services without compromising internal network security. Future changes to network security may require alternative schemes to be devised to enable grid computing to be used in conjunction with protected data sources.

References

[1] M. Burns, A.L. Rowland, J.V. Hajnal, D. Rueckert, D.L.G. Hill. A Grid Infrastructure for Image Segmentation. UK e-Science All Hands Meeting 2004.

[2] A.L. Rowland, T. Hartkens, M. Burns, J.V. Hajnal, D. Rueckert, D.L.G. Hill. Information eXtraction from Images (IXI): Image Processing Workflows Using A Grid Enabled Image Database. Proceedings of Distributed Databases in Medical Image Computing - MICCAI 2004.

[3] M.A. Baker, H. Ong, G. Smith. A Report on Experiences Operating the Globus Toolkit through a Firewall. DSG Technical Report 2001.01, September 2001. http://dsg.port.ac.uk/projects/grid-security/docs/globus-firewall-experiences.pdf

[4] S. Son, B. Allcock, M. Livny. CODO: Firewall Traversal by Cooperative On-Demand Opening. Proceedings of the 14th IEEE Symposium on High Performance Distributed Computing.

[5] A Rough Guide to Grid Security. UK e-Science Technical Report Series (UKeS-2002-05). http://eprints.ecs.soton.ac.uk/7286

[6] Globus security warning for Grid FTP. http://www-unix.globus.org/mail_archive/security-announce/Archive/msg00011.html

[7] Security Bulletin - Microsoft Outlook date header. http://www.microsoft.com/technet/security/bulletin/MS00-043.mspx

[8] Countermeasures http://www.linuxjournal.com/article/6701

[9] Globus toolkit firewall requirements. http://www-fp.globus.org/security/v2.0/firewalls.html
